The coding discussed here is about identifying data, not the coding that means programming.
The simplest form of coding, I guess, would be serial numbering, which is as easy as 1, 2, 3… Such coding immediately gives the reader an idea of how many records there are in the database, or how long records have been kept. Fortunately, we are not limited to that. We can design codes so that, by merely looking at one, those in the know immediately pick up more information.
The main reason for coding is the technology behind it, mostly because relational databases require it. This blog is not about relational databases though, so let’s focus on coding for now and just agree that we need it, since it makes storing and retrieving information more efficient and accurate.
To achieve such accuracy and efficiency, a code must be unique for each item in the entire storage. This means that a code should be used only once, for one specific item or individual. For example, in a store that sells vegetables, it would be confusing to use code #1 for both carrots and celery, right? So code #1 should be used only for carrots, #2 only for celery and so on.
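Here is a minimal sketch of that uniqueness rule in Python. The `CodeRegistry` class and its method names are made up for illustration; a real database would normally enforce this with a unique key instead.

```python
class CodeRegistry:
    """Hypothetical in-memory registry: each code maps to exactly one item."""

    def __init__(self):
        self._items = {}

    def register(self, code, item):
        """Assign a code to an item, refusing a code that is already taken."""
        if code in self._items:
            raise ValueError(
                f"code {code!r} is already used for {self._items[code]!r}"
            )
        self._items[code] = item

    def lookup(self, code):
        """Return the single item stored under this code."""
        return self._items[code]


registry = CodeRegistry()
registry.register(1, "carrots")
registry.register(2, "celery")
# registry.register(1, "celery") would raise ValueError: code 1 is taken.
```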
The other requirement of coding is that it should be able to accommodate the number of records it has to manage. Let’s say, for example, that the store above has at most 100 kinds of vegetables. In this case, it is safe to use a 3-digit code for the items, since 001 to 999 leaves plenty of room.
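The capacity check can be sketched in a few lines of Python. The function name `digits_needed` is my own, and I am assuming numbering starts at 1.

```python
def digits_needed(max_records):
    """Number of decimal digits needed to number records 1..max_records."""
    if max_records < 1:
        raise ValueError("max_records must be at least 1")
    return len(str(max_records))


width = digits_needed(100)  # 3, because the code 100 itself must fit
codes = [f"{n:0{width}d}" for n in (1, 42, 100)]
print(codes)  # ['001', '042', '100']
```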
Yet another requirement is that codes should be consistent. So if numbers are used for one code, all the other codes should use numbers too. And if 3 digits are used, all records are expected to use the same 3 digits, so 001 rather than 1.
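Consistency is easy to check mechanically. This small Python sketch assumes the format is exactly three digits; the names `CODE_PATTERN` and `is_consistent` are just for illustration.

```python
import re

# Assumed format for this example: exactly three ASCII digits.
CODE_PATTERN = re.compile(r"[0-9]{3}")


def is_consistent(code):
    """True only if the whole code matches the agreed three-digit format."""
    return CODE_PATTERN.fullmatch(code) is not None


print(is_consistent("007"))   # True
print(is_consistent("7"))     # False, too short
print(is_consistent("A07"))   # False, mixes in a letter
```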
Although numbers work very well for coding, letters make much more sense to humans. For example, it is easier to figure out that “ca” means Canada and “fr” means France than to remember “1” for Canada and “2” for France. I’d say then that the best way of coding is to use both letters and numbers.
Let’s take the scenario of, say, a school creating a code for each student. As mentioned above, we can just use serial numbering and get an idea of how many students there are in the school. Let’s say though that, other than the count of students, we also want to have an idea about their studies. Let’s say the school has Accounting, Engineering and Health departments. If there are at most 10 departments, it is safe to use just a single letter for the department code. The following codes could then be used:
- A – Accounting
- E – Engineering
- H – Health
So now, we can use the following code:
A-001 for Accounting student number 1;
E-002 for Engineering student number 2;
H-003 for Health student number 3, and so on.
At this point, we can decide whether we want to use a single serial number for the whole school or let each department have its own serial numbers. In the latter case the codes could be A-001, E-001 and H-001, which is still valid as they are still unique even though they share the same serial number. At this point, the code makes more sense, right? Not only do we see how many students there are, we also know what they’re studying.
With this coding though, we will have difficulty maintaining uniqueness, as we will be limited to 999 students per department for the school’s entire lifespan. Let’s fix this by adding, say, the year the student first enrolled. The code will now have the following scheme:
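As a rough Python sketch, per-department serial numbers could be handed out like this. The `StudentCoder` class is hypothetical, and the department letters are the three from the list above.

```python
from collections import defaultdict

DEPARTMENTS = {"A": "Accounting", "E": "Engineering", "H": "Health"}


class StudentCoder:
    """Hypothetical generator that keeps one serial counter per department."""

    def __init__(self):
        self._last_serial = defaultdict(int)

    def new_code(self, dept):
        if dept not in DEPARTMENTS:
            raise ValueError(f"unknown department code {dept!r}")
        self._last_serial[dept] += 1
        return f"{dept}-{self._last_serial[dept]:03d}"


coder = StudentCoder()
first_codes = [coder.new_code(d) for d in ("A", "E", "H", "A")]
print(first_codes)  # ['A-001', 'E-001', 'H-001', 'A-002']
```

For a single school-wide serial, one shared counter would replace the per-department dictionary.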
YYYY-D-999
Sample code: 2014-A-002.
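A small Python sketch of this scheme, assuming the serial still tops out at 999 but now restarts every enrollment year. The function name `make_code` is my own.

```python
def make_code(year, dept, serial):
    """Build a YYYY-D-999 style code from its three parts."""
    if not 1000 <= year <= 9999:
        raise ValueError("year must have four digits")
    if not 1 <= serial <= 999:
        raise ValueError("serial must be between 1 and 999")
    return f"{year}-{dept}-{serial:03d}"


print(make_code(2014, "A", 2))  # 2014-A-002
print(make_code(2015, "A", 2))  # 2015-A-002, same serial but still unique
```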
Code looks good, eh? Let’s say though we want more info. The school has 3 semesters and we want to add that info into the code. The coding could then look like this:
YYYY-S-D-999
Sample code: 2014-1-E-003.
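Reading the information back out of such a code is straightforward. Here is a hedged Python sketch; the `StudentCode` tuple and `parse_code` function are names I made up, and the pattern assumes semesters 1 to 3 and a single uppercase department letter.

```python
import re
from typing import NamedTuple


class StudentCode(NamedTuple):
    year: int
    semester: int
    department: str
    serial: int


# Assumed layout: YYYY-S-D-999 with S in 1..3 and D a single uppercase letter.
PATTERN = re.compile(r"([0-9]{4})-([1-3])-([A-Z])-([0-9]{3})")


def parse_code(code):
    """Split a code into its parts, or raise ValueError if it does not fit."""
    match = PATTERN.fullmatch(code)
    if match is None:
        raise ValueError(f"not a valid student code: {code!r}")
    year, semester, department, serial = match.groups()
    return StudentCode(int(year), int(semester), department, int(serial))


parsed = parse_code("2014-1-E-003")
print(parsed.year, parsed.semester, parsed.department, parsed.serial)
# 2014 1 E 3
```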
Say instead of 1, 2 and 3 for the semesters, we want to use the following code:
- F – Fall
- W – Winter
- S – Summer
Sample code: 2014-F-A-004.
Still fine, right? One thing to consider though: when the codes are sorted, the semesters will show up as F, S then W, instead of 1, 2 and 3. That’s a bit confusing, as in this case the letters do not sort in their natural sequence. Oh well, we could use A, B and C for Fall, Winter and Summer to get proper sorting, but it’s a bit easier to remember F, W and S, right? It’s up to the organization to decide which code to use. Having a programmer’s mentality, I prefer 1, 2 and 3, but others may differ and that’s OK as long as the pros and cons are known.
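To see the sorting issue in action, here is a short Python sketch. The `SEMESTER_ORDER` table and `sort_key` function are my own names; they show one possible workaround if the organization keeps F, W and S.

```python
# Order of the semesters as listed above: Fall, Winter, Summer.
SEMESTER_ORDER = {"F": 1, "W": 2, "S": 3}


def sort_key(code):
    """Sort by year, then semester in school order, then department and serial."""
    year, semester, dept, serial = code.split("-")
    return (int(year), SEMESTER_ORDER[semester], dept, int(serial))


codes = ["2014-W-A-002", "2014-S-A-003", "2014-F-A-001"]

plain = sorted(codes)
print(plain)    # ['2014-F-A-001', '2014-S-A-003', '2014-W-A-002']

natural = sorted(codes, key=sort_key)
print(natural)  # ['2014-F-A-001', '2014-W-A-002', '2014-S-A-003']
```

The price of the friendlier letters is that every report or query has to carry this extra ordering rule around, which is exactly the trade-off to weigh.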
Just for a bit of a twist, since the code for the year seems too obvious, we can disguise it, for example with a year code whose first letter stands for the century: A means 2000, B means 2100 and so on.
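One way this could work, sketched in Python. The exact layout is my assumption here: one century letter followed by the two-digit year, so 2014 would become A14. The function names are made up as well.

```python
def encode_year(year):
    """Assumed layout: century letter (A = 2000s, B = 2100s, ...) + 2-digit year."""
    century_index, year_in_century = divmod(year - 2000, 100)
    if not 0 <= century_index < 26:
        raise ValueError("year must be between 2000 and 4599")
    return f"{chr(ord('A') + century_index)}{year_in_century:02d}"


def decode_year(code):
    """Reverse of encode_year under the same assumed layout."""
    century_index = ord(code[0]) - ord("A")
    return 2000 + century_index * 100 + int(code[1:])


print(encode_year(2014))   # A14
print(encode_year(2105))   # B05
print(decode_year("A14"))  # 2014
```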
I hope that sheds some light on simple coding techniques. There are other, more sophisticated ways, such that a code can verify itself. Maybe later I could blog about that too. For now, suffice it to say that codes can be used to quickly convey a bit of information.