Tuesday, June 30, 2015

Use Tuple types to model data

To use sqlite.swift with Swift 2, see the Create a Data Access Layer with SQLite.Swift and Swift 2 post.

A couple of weeks back I wrote a blog post on how to create a data access layer using SQLite.swift.  You can see the post here.  In one of the projects that I am currently working on, I wanted to create the data access layer as I described in that post because I know how well it worked in previous projects.  For this particular project, however, there are several tables that have only a couple of columns, which means I needed to create a number of classes in the data model layer that contained only a few properties.

It seemed like a waste to create all those classes that had only a couple of properties, so I started wondering if I could use tuples instead of classes to model my data.  I was unsure how modeling data with tuples would work, but I decided to give it a try.  I found out that they worked really well.

In this post I will explain how we could use tuples to model our data and then I will show how I would replace the data model classes from my previous post with tuples.  If you have not read my previous post about creating a data access layer with SQLite.swift, you can read it here.

What are tuple types

A Tuple type groups zero or more values into a single compound type.  Tuples can contain values of different types which allows us to group related data of different types together.  There are many uses for tuples and one of the most common is to use them as a return type from a function when we need to return multiple values.  The Void return type is actually a typealias for a tuple with no values.
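As a small illustration of the two points above, the following sketch returns a named tuple from a function and shows that Void and the empty tuple are interchangeable.  The minMax function and its sample values are made up for this example, and the code assumes a current Swift toolchain.

```swift
// Hypothetical helper: returns two values at once as a named tuple.
func minMax(of values: [Int]) -> (min: Int, max: Int)? {
    guard let first = values.first else { return nil }
    var result = (min: first, max: first)
    for value in values.dropFirst() {
        if value < result.min { result.min = value }
        if value > result.max { result.max = value }
    }
    return result
}

let range = minMax(of: [46, 12, 73, 5])

// Void is a typealias for the empty tuple, so these two function
// types are the same type and the assignment compiles.
let doNothing: () -> Void = { }
let alsoNothing: () -> () = doNothing

// The only value of type Void is the empty tuple itself.
let nothing: Void = ()
```

The tuple returned by minMax can be read by label (range?.min, range?.max), which is what makes tuples convenient as multi-value return types.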

Using tuple types to model our data

When I say that I want to model our data, what I am referring to is grouping related data together in a single structure.  As an example, if I want to create a class named PersonClass to model the information for a person, the class may look like this:

class PersonClass {
    var firstName: String
    var lastName: String
    var age: Int
   
    init(firstName: String, lastName: String, age: Int) {
        self.firstName = firstName
        self.lastName = lastName
        self.age = age
    }
}

In this class we define three properties for our person and we also create an initializer that will set these properties.  This is quite a bit of code to simply store data.  If we wanted to create a typealias for a tuple named PersonTuple which models the same data, it would look like this:

typealias PersonTuple = (firstName: String, lastName: String, age: Int)

As we can see it takes a lot less code to create our PersonTuple tuple as compared to the PersonClass class.  Creating an instance of the PersonClass class is very similar to creating a PersonTuple variable.  The following code demonstrates this:

var pClass = PersonClass(firstName: "Jon", lastName: "Hoffman", age: 46)
var pTuple: PersonTuple = (firstName:"Jon", lastName:"Hoffman", age: 46)

We could actually shorten the tuple definition as shown in the following code, but I prefer naming the parameters to show what they mean (this is really a personal preference).

var pTuple: PersonTuple = ("Jon", "Hoffman", 46)
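Whichever form we use to create the tuple, the element names come from the PersonTuple type, so they remain available afterwards.  The sketch below (written for a current Swift toolchain; the constant names are only for illustration) reads the elements by label, by position, and by decomposing the tuple:

```swift
typealias PersonTuple = (firstName: String, lastName: String, age: Int)

// Created without labels; the labels still come from the PersonTuple type.
let person: PersonTuple = ("Jon", "Hoffman", 46)

let byLabel = person.firstName   // access by label
let byIndex = person.0           // access by position

// Decompose the tuple into three separate constants.
let (first, last, age) = person
```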

We can pass tuple types within our code just like we would pass an instance of a class or structure.  In the following code we show how to write a function that accepts an instance of the PersonClass as the only parameter and a function that accepts a PersonTuple as the only parameter.

func acceptPersonClass(person: PersonClass) {
    println("\(person.firstName) \(person.lastName)")
}
func acceptPersonTuple(person: PersonTuple) {
    println("\(person.firstName) \(person.lastName)")
}
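One difference to keep in mind when passing these around is that a class is a reference type while a tuple is a value type, so a tuple is copied when it is assigned or passed to a function.  The following sketch illustrates this under current Swift syntax; the ageClass and ageTuple functions are hypothetical helpers and not part of the sample project.

```swift
typealias PersonTuple = (firstName: String, lastName: String, age: Int)

final class PersonClass {
    var firstName: String
    var lastName: String
    var age: Int

    init(firstName: String, lastName: String, age: Int) {
        self.firstName = firstName
        self.lastName = lastName
        self.age = age
    }
}

// Hypothetical helper: the change is visible to the caller because
// PersonClass is a reference type.
func ageClass(_ person: PersonClass) {
    person.age += 1
}

// Hypothetical helper: a tuple parameter is a copy, so the function
// needs inout to change the caller's value.
func ageTuple(_ person: inout PersonTuple) {
    person.age += 1
}

let pClass = PersonClass(firstName: "Jon", lastName: "Hoffman", age: 46)
ageClass(pClass)

var pTuple: PersonTuple = (firstName: "Jon", lastName: "Hoffman", age: 46)
var copy = pTuple        // an independent copy of the tuple
copy.age = 30            // does not affect pTuple
ageTuple(&pTuple)
```

After this code runs, pClass.age and pTuple.age are both 47, while copy.age is 30.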

When we replace data modeling classes or structures with tuples, our code can become much more compact and in some ways easier to understand; however, we do lose the ability to add functionality to our data model types.  Some, including myself, would argue that losing the ability to add functions to our data model types is a good thing: if we truly want to separate our data model from our business logic, we should not be embedding business logic in our data model classes.
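As a rough sketch of what that separation can look like, the logic can live in ordinary functions that accept the tuple, leaving the tuple itself as plain data.  The fullName and isAdult functions and the age rule below are invented for illustration only.

```swift
typealias PersonTuple = (firstName: String, lastName: String, age: Int)

// Hypothetical business-logic functions that live outside the data model.
func fullName(of person: PersonTuple) -> String {
    return "\(person.firstName) \(person.lastName)"
}

func isAdult(_ person: PersonTuple, minimumAge: Int = 18) -> Bool {
    return person.age >= minimumAge
}

let jon: PersonTuple = (firstName: "Jon", lastName: "Hoffman", age: 46)
let name = fullName(of: jon)
let adult = isAdult(jon)
```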

Replacing data modeling classes with tuples

Tuples weren’t meant to be used as replacements for classes or structures; however, if we create a typealias of a tuple type, it can very easily be used to model our data.  In the Create a Data Access Layer using SQLite.swift post we had two classes in our data model layer:  Team.swift and Player.swift.  The code for these two classes is shown below:

Team.swift
import Foundation

class Team {
   
    var teamId: Int64?
    var city: String?
    var nickName: String?
    var abbreviation: String?
   
    init(teamId: Int64, city: String, nickName: String, abbreviation: String) {
       
        self.teamId = teamId
        self.city = city
        self.nickName = nickName
        self.abbreviation = abbreviation
    }
}

Player.swift
import Foundation

class Player {
   
    var playerId: Int64?
    var firstName: String?
    var lastName: String?
    var number: Int?
    var teamId: Int64?
    var position: Positions?
   
    init (playerId: Int64, firstName: String, lastName: String, number: Int, teamId: Int64, position: Positions?) {
        self.playerId = playerId
        self.firstName = firstName
        self.lastName = lastName
        self.number = number
        self.teamId = teamId
        self.position = position
    }
}

Now to replace these two classes with tuples, all I really need to do is create typealiases instead.  For this I create a DataModel.swift file that contains the following code:

typealias Team = (teamId: Int64?, city: String?, nickName: String?, abbreviation: String?)

typealias Player = (playerId: Int64?, firstName: String?, lastName: String?, number: Int?, teamId: Int64?, position: Positions?)
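Because every element in these tuples is optional, code that consumes them has to unwrap the values it needs.  The sketch below shows one way that might look in current Swift.  The Positions enum here is only a stand-in (the real enum from the project is not shown in this post), and the sample values and the describe function are made up.

```swift
// Hypothetical stand-in for the project's Positions enum.
enum Positions: String {
    case pitcher = "Pitcher"
    case catcher = "Catcher"
}

typealias Team = (teamId: Int64?, city: String?, nickName: String?, abbreviation: String?)

typealias Player = (playerId: Int64?, firstName: String?, lastName: String?, number: Int?, teamId: Int64?, position: Positions?)

// Sample data for illustration only.
let team: Team = (teamId: 1, city: "Boston", nickName: "Red Sox", abbreviation: "BOS")
let player: Player = (playerId: nil, firstName: "Jon", lastName: "Hoffman", number: 34, teamId: team.teamId, position: nil)

// Hypothetical helper that supplies fallbacks for missing values.
func describe(_ player: Player) -> String {
    let first = player.firstName ?? "?"
    let last = player.lastName ?? "?"
    let number = player.number.map { "#\($0)" } ?? "no number"
    return "\(first) \(last) (\(number))"
}

let summary = describe(player)
```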

I then deleted the Team and Player classes and was able to build/run the project as it was before.  We could additionally remove the typealias names from the find() and findAll() methods of the data helper classes.  As an example, the findAll() method from the PlayerDataHelper class looks like this:

static func findAll() -> [T]? {
    var retArray = [T]()
    for item in table {
        retArray.append(Player(playerId: item[playerId], firstName: item[firstName], lastName: item[lastName], number: item[number], teamId: item[teamId], position: Positions(rawValue: item[position])))
    }
    return retArray
}

We could change this function to this:

static func findAll() -> [T]? {
    var retArray = [T]()
    for item in table {
        retArray.append((playerId: item[playerId], firstName: item[firstName], lastName: item[lastName], number: item[number], teamId: item[teamId], position: Positions(rawValue: item[position])))
    }
    return retArray
}

However, I think the code reads better if we keep the typealias name in the code.
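To try the same row-to-tuple pattern without a database, here is a minimal, self-contained sketch in current Swift.  It does not use SQLite.swift at all: the rows array of dictionaries is a hypothetical stand-in for query results, and this findAll(in:) is a simplified free function rather than the helper method from the project.

```swift
typealias Team = (teamId: Int64?, city: String?, nickName: String?, abbreviation: String?)

// Hypothetical stand-in for query results: each row maps a column
// name to a text value. The second row has no abbreviation.
let rows: [[String: String]] = [
    ["teamId": "1", "city": "Boston", "nickName": "Red Sox", "abbreviation": "BOS"],
    ["teamId": "2", "city": "Baltimore", "nickName": "Orioles"]
]

// Builds one Team tuple per row; missing columns become nil.
func findAll(in rows: [[String: String]]) -> [Team] {
    var retArray = [Team]()
    for item in rows {
        let team: Team = (teamId: item["teamId"].flatMap { Int64($0) },
                          city: item["city"],
                          nickName: item["nickName"],
                          abbreviation: item["abbreviation"])
        retArray.append(team)
    }
    return retArray
}

let teams = findAll(in: rows)
```

Annotating the local constant with the Team typealias keeps the intent readable, in the same spirit as keeping the typealias name in the helper methods.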

I created a GitHub repository which contains the sample project for the “Create a data access layer using SQLite.swift” post with the changes made in this post.  The repository is located here (https://github.com/hoffmanjon/SQLiteDataAccessLayer).  I would like to know what others think of using tuples to model data.  Please leave comments below.

If you would like to learn more about Swift, you can check out my book on amazon.com or on packtpub.com.


19 comments:

  1. Will you be updating your book for Swift 2.0 at some point? If so, will I need to buy the book again or will the update be free?

    Replies
    1. My book just came out so I cannot commit to updating it at this time but I am thinking about writing another book that would expand on the material in Mastering Swift and also cover Swift 2.0.
      I will continue writing in this blog and will soon begin writing about the new features in Swift 2.0.

    2. Thanks for replying. Although I already own several books on swift and ios/osx programming, I went ahead and bought yours. I find that it's helpful to get different views, examples, and explanations from different developers. I'm not a software engineer, just an old hacker who wire wrapped his first computer in 1976 and has been tinkering ever since!

    3. Thank you for purchasing my book, I hope you find it very useful. If you have any questions, please do not hesitate to drop me a note.

  2. At first I liked this approach.....it simplified the tables and eliminated some boilerplate code. But as I thought more about it, I decided I don't like it for the following reasons:
    1) Tuples are value-types, not reference types.....so if you pass one of these around, it will get copied and then any change/save work could be inconsistent.....most seriously, the autoincrement ID after an insert....

    2) My dream is to use protocol extensions and generics to eliminate the "helper" class boilerplate and tuples are not going to work in this regard....you can't easily put meta-data in them, and they don't carry (or inherit) any methods...

    For example, in the "Class" example, I had added meta-data for required fields as follows:

    static let requiredFields = ["name", "position"]

    the methods inherited from the protocol extension (Swift 2) could use this meta-data to automatically handle the mandatory-fields checking before Insert or Update.....

    But you can't do any of this from the Tuple structure.....
    I'm afraid it's going to be very limited going forward......I think a better approach is to find a way to protocol-extend the class so you don't need to create the default initializer. But such fancy generics are currently over my head....

    Replies
    1. Honestly, there is no truly perfect solution; however, in my experience, when you mix business logic with your data access class you will probably end up regretting it. That is why tuples work so well in this approach: we cannot add any logic to our data access layer if we use tuples. The data access layer should simply transfer data from the data storage layer to our business logic layer.
      If you ask me if you should always follow that rule (removing all logic from the data access layer), I would refer you to my first sentence that there is no perfect solution; however, I would ask what benefit you get from putting business logic in your data access layer.
      When we do separate the business logic from our data access layer, from my experience, we get projects that are easier to maintain as requirements change.
      Just my thoughts based on my past experience.

  3. I get your point about business logic and I totally agree with you. And when you say "data access class", I'm assuming you mean your "helper" classes since the tuples perform no IO at all. As a former DBA, I make a distinction between application business logic and data-integrity logic (consistency) in the DB. The relational model is exactly that...a model with rules and constraints unto itself. And things like required fields, value-validation, and basic I/O (implicit joins, etc) lies closer to the model than it does to any other area of code. In fact, you don't want other parts of the code to have to think about those things....and (to my mind anyway), when the programmer has to keep two separate entities in mind (one for getter/setter and one for operations upon the former), it just seems to complicate things.....it even seems to violate OOP ideas as I understand them.....especially since all of the column defs need to be mirrored in both objects. Perhaps I'm too new to Swift/Xcode, but I'm having trouble understanding the benefit of that separation....can you point me at further reading to help me better understand this perspective so I don't paint myself into a corner? Thanks for sharing your experience.

    Replies
    1. Keep in mind that at the data access layer we should be using the data modeling classes simply to transfer information from the data storage layer to the business logic layer and vice versa. Let's look at a very basic example.
      Let's say that we had a table named "cars" and in that table we had a column named GasTankSize. Then in the data access layer we put logic that said the gas tank size must be between 10 and 50 gallons because every car should have a gas tank between those sizes.
      A year from now, we get a new requirement to add electric cars to our database; however, they do not have gas tanks. In the future we may have lots of different ways to power our cars, so for each type of car we would need different logic. If we tried to implement all that logic at the data access layer we would have code that would be very hard to maintain.
      It would be much easier to maintain if, for our example, we had a CarModel type (that would be our tuple) which simply transferred the information to/from the cars table into a class or struct that implemented a CarProtocol. Types of classes/structs that would implement the CarProtocol could be GasCar, ElectricCar, DieselTruck... and each one of those types would implement its own logic about what would be stored in the cars table and how the data was interpreted.
      Does that explain why I personally avoid putting business logic in the data access layer? I believe that also explains your next question about value types.

    2. I get your point although the deeper point in your example is that the underlying table-structure (or column type) was wrong to begin with.....and that seems a much bigger problem than how to abstract field-validation. But I don't really see how this explains away the problem of me passing around tuples of record-entities and getting lots of different copies so it's hard to merge / track the mods against several of them....

    3. In my car example we would use the tuple to retrieve the data from the data storage layer and then initialize the types that conform to the CarProtocol using those tuples. Maybe some code will show it better (once again sorry for the lack of indentations).
      Let's say that we have a tuple as shown in the following code (very simplified; normally we would have more information):

      typealias CarModel = (name: String, gasTankSize: Int)

      now we define our car protocol like this:

      protocol Car {
      var name: String {get set}
      init(model: CarModel)
      }

      We could then implement the GasCar class which conforms to the CarProtocol like this:

      class GasCar: Car {
      var name: String
      var gasTankSize: Int

      required init(model: CarModel) {
      name = model.name
      gasTankSize = model.gasTankSize
      }
      }

      and the ElectricCar class that also conforms to the CarProtocol like this (notice we do not use gasTankSize in this class; we could use it and default it to 0 if we wanted to):

      class ElectricCar: Car {
      var name: String

      required init(model: CarModel) {
      name = model.name
      }
      }

      In this example we can use Protocol Extensions to implement any functionality that is the same among the types that conform to the CarProtocol. We would then pass the types that implement the CarProtocol around our application and not the tuples.
      Now whenever we need a new type of car we simply create a new type that conforms to the car protocol. Within those types we could implement helper methods that retrieve the data from the data storage layer or we could create a separate class that does that. All of the logic for each car type would be stored within the individual types.
      Does the example explain it better?

    4. This example does explain it much better, but we've now reverted back to using Classes for all the heavy work and I think we now have 3 places to maintain column-names and types.....tuple, ModelClass and HelperClass. This type of formality makes total sense to me for those rare DB tables that model some ever-growing and evolving entity....anticipating future code changes. But this much redundancy totally kills me for the 90% of other (likely) non-evolving DB tables that just need basic CRUD & basic integrity enforcement operations.......it feels like rigid adherence to OOP at the expense of pragmatism. There are 1 or 2 tables in each app where I see the wisdom in such careful factoring.....and because of the anticipated future requirements, I would probably store them as JSON or XML with versioning.....not discrete columns because then your flexibility is hindered by the model anyway. But when you have many other tables, I don't want to have to declare & maintain those columns in 3 places......I'd shoot myself. Perhaps when I have lots more experience, I'll see the benefit of such universal rigor.

    5. At this point you are assuming that at the business logic layer our entities (classes in my example) have to match the entities at the data storage layer. This is very rarely the case. To expand the car example further, we may have database tables that list the various types of radios, tires, seats.... that our car may have; however, all that information would be stored in a single car entity at the business logic layer. This entity could look like this:

      class car {
      var name: String
      var GasTankSize: Int
      var tireSize: Int
      var tireMake: String
      ....
      }

      or you could store the tuples that contain the information directly from the database

      class car {
      var name: String
      var gasTankSize: Int
      var tires: TiresType
      var radio: RadioType
      }

      I would recommend the latter because that keeps the information grouped together, which will make it easier to save back to the database later. This is also where you will see the advantages of value types over reference types.
      Let's say that we retrieved some seats from the database and put that information in six different car entities. Then for one of the cars we wanted to change the color of the seats from brown to red. If the seat entity was a reference type and we changed the color value in one of our cars, it would change the value for all of our cars; however, since tuples are value types, we are able to change the color value for just that one car (we would probably want to change the unique DB identifier to 0 so that when we save it, it will create a new entry in the seats table).
      You are correct that this adds additional code to your application. Designing the application as you have described will also work and work well. Your approach will also be fairly easy to maintain if you do not have any changes to your database. In my past experience, it is very rare for the database to not change, and most of the time we have pretty significant changes to the database structure during the lifetime of the application. With the approach that I use, it is a lot easier to make changes to the database without affecting large portions of our application.
      To me "Requirements Will Change" is one of the principles of application development therefore you should design your application to handle the changes that are coming even though you do not know what they are yet.

  4. Any thoughts about the "value type" issue......I imagine passing my objects around so that generic functions can modify them. If I use tuples, it will always be a different instance getting modified at each transform......unless I declare every function arg as an "inout" which I guess would work.....but again, this imposes one other thing the programmer needs to think about and manage.......just seems to add risk to me....

  5. I just bumped up to Swift 2 and I'm having trouble with the prior code. The Connection() method now throws....which means we have to "try"....which means that the init() method could possibly exit without initializing BBDB

    The compiler really doesn't like that......

    SQLiteDataStore { // singleton: holds connection to the DB
    static let sharedInstance = SQLiteDataStore()
    let BBDB: Connection // <--- problem is here

    but if you change BBDB to an optional, then you have to force-unwrap it every place it is used....

    What do you recommend to deal with this conundrum?

    Replies
    1. sqlite.swift has a new branch for Swift 2 which you will need to download. A lot changes in that branch. There are examples in the code that comes down that demonstrate how to use it. Once Swift 2 and sqlite.swift are finalized, I will post another tutorial, but here is the code that works for me with the latest builds:

      class SQLiteDataStore {
      static let sharedInstance = SQLiteDataStore()
      let DB: Connection?

      private init() {

      var path = "MyDB.sqlite"
      if let dirs : [String] = NSSearchPathForDirectoriesInDomains(NSSearchPathDirectory.DocumentDirectory, NSSearchPathDomainMask.AllDomainsMask, true) as [String] {

      let dir = dirs[0] as NSString
      path = dir.stringByAppendingPathComponent("MyDB.sqlite");
      }

      do {
      DB = try Connection(path)
      } catch _ {
      DB = nil
      }
      }
      }

      Hope that helps you

    2. Can't figure out how to indent in these comments, sorry for the lack of indentation in the code.

    3. Yes this is helpful!! No worries about the indentation.....thanks for sharing your experience...

  6. This comment has been removed by a blog administrator.

  7. Thanks for the article on data modeling. This is very helpful, please keep posting.
