Whenever I design a database, I always wonder if there is a best way of naming an item in my database. Quite often I ask myself the following questions:
Are there any recommended guidelines out there for naming items in a database?
Late answer here, but in short:
(1) What you must do. There are very few things that you must do a certain way, every time, but there are a few.
(2) What you should probably do.
(3) What you should consider.
Here's a link that offers a few choices. I was searching for a simple spec I could follow rather than having to rely on a partially defined one.
Very late to the party but I still wanted to add my two cents about column prefixes
There seem to be two main arguments for using the table_column (or tableColumn) naming standard for columns, both based on the fact that the column name itself will be unique across your whole database:
1) You do not have to specify table names and/or column aliases in your queries all the time
2) You can easily search your whole code for the column name
I think both arguments are flawed. The solution for both problems without using prefixes is easy. Here's my proposal:
Always use the table name in your SQL. E.g., always use table.column instead of column.
It obviously solves 2) as you can now just search for table.column instead of table_column.
But I can hear you scream, how does it solve 1)? It was exactly about avoiding this. Yes, it was, but the solution was horribly flawed. Why? Well, the prefix solution boils down to:
To avoid having to specify table.column when there's ambiguity, you name all your columns table_column!
But this means you will from now on ALWAYS have to write the column name every time you specify a column. But if you have to do that anyways, what's the benefit over always explicitly writing table.column? Exactly, there is no benefit, it's the exact same number of characters to type.
edit: yes, I am aware that naming the columns with the prefix enforces the correct usage whereas my approach relies on the programmers
Ok, since we're weighing in with opinion:
I believe that table names should be plural. Tables are a collection (a table) of entities. Each row represents a single entity, and the table represents the collection. So I would call a table of Person entities People (or Persons, whatever takes your fancy).
For those who like to see singular "entity names" in queries, that's what I would use table aliases for:
SELECT person.Name FROM People person
A bit like LINQ's "from person in people select person.Name".
As for 2, 3 and 4, I agree with @Lars.
I think the best answer to each of those questions would be given by you and your team. It's far more important to have a naming convention then how exactly the naming convention is.
As there's no right answer to that, you should take some time (but not too much) and choose your own conventions and - here's the important part - stick to it.
Of course it's good to seek some information about standards on that, which is what you're asking, but don't get anxious or worried about the number of different answers you might get: choose the one that seems better for you.
Just in case, here are my answers:
I work in a database support team with three DBAs and our considered options are:
We use singular names for tables. Tables tend to be prefixed with the name of the system (or its acronym). This is useful if the system complex as you can change the prefix to group the tables together logically (ie. reg_customer, reg_booking and regadmin_limits).
For fields we'd expect field names to be include the prefix/acryonm of the table (i.e. cust_address1) and we also prefer the use of a standard set of suffixes ( _id for the PK, _cd for "code", _nm for "name", _nb for "number", _dt for "Date").
The name of the Foriegn key field should be the same as the Primary key field.
SELECT cust_nm, cust_add1, booking_dt FROM reg_customer INNER JOIN reg_booking ON reg_customer.cust_id = reg_booking.cust_id
When developing a new project, I'd recommend you write out all the preferred entity names, prefixes and acronyms and give this document to your developers. Then, when they decide to create a new table, they can refer to the document rather than "guess" what the table and fields should be called.
I know this is late to the game, and the question has been answered very well already, but I want to offer my opinion on #3 regarding the prefixing of column names.
All columns should be named with a prefix that is unique to the table they are defined in.
E.g. Given tables "customer" and "address", let's go with prefixes of "cust" and "addr", respectively. "customer" would have "cust_id", "cust_name", etc. in it. "address" would have "addr_id", "addr_cust_id" (FK back to customer), "addr_street", etc. in it.
When I was first presented with this standard, I was dead-set against it; I hated the idea. I couldn't stand the idea of all that extra typing and redundancy. Now I've had enough experience with it that I'd never go back.
The result of doing this is that all of the columns in your database schema are unique. There is one major benefit to this, which trumps all arguments against it (in my opinion, of course):
You can search your entire code base and reliably find every line of code that touches a particular column.
The benefit from #1 is incredibly huge. I can deprecate a column and know exactly what files need to be updated before the column can safely be removed from the schema. I can change the meaning of a column and know exactly what code needs to be refactored. Or I can simply tell if data from a column is even being used in a particular portion of the system. I can't count the number of times this has turned a potentially huge project into a simple one, nor the amount of hours we've saved in development work.
Another, relatively minor benefit to it is that you only have to use table-aliases when you do a self join:
SELECT cust_id, cust_name, addr_street, addr_city, addr_state FROM customer INNER JOIN address ON addr_cust_id = cust_id WHERE cust_name LIKE 'J%';
Essential Database Naming Conventions (and Style) (click here for more detailed description)
table names choose short, unambiguous names, using no more than one or two words distinguish tables easily facilitates the naming of unique field names as well as lookup and linking tables give tables singular names, never plural (update: i still agree with the reasons given for this convention, but most people really like plural table names, so i’ve softened my stance)... follow the link above please
Table names singular. Let's say you were modelling a realtionship between someone and their address. For example, if you are reading a datamodel would you prefer 'each person may live at 0,1 or many address.' or 'each people may live at 0,1 or many addresses.' I think its easier to pluralise address, rather than have to rephrase people as person. Plus collective nouns are quite often dissimlar to the singular version.
Ok. Thats my $0.02
Table Name: It should be singular, as it is a singular entity representing a real world object and not objects, which is singlular.
Column Name: It should be singular only then it conveys that it will hold an atomic value and will confirm to the normalization theory. If however, there are n number of same type of properties, then they should be suffixed with 1, 2, ..., n, etc.
Prefixing Tables / Columns: It is a huge topic, will discuss later.
Casing: It should be Camel case
My friend, Patrick Karcher, I request you to please not write anything which may be offensive to somebody, as you wrote, "•Further, foreign keys must be named consistently in different tables. It should be legal to beat up someone who does not do this.". I have never done this mistake my friend Patrick, but I am writing generally. What if they together plan to beat you for this? :)
SELECT UserID, FirstName, MiddleInitial, LastName FROM Users ORDER BY LastName
I hear the argument all the time that whether or not a table is pluralized is all a matter of personal taste and there is no best practice. I don't believe that is true, especially as a programmer as opposed to a DBA. As far as I am aware, there are no legitimate reasons to pluralize a table name other than "It just makes sense to me because it's a collection of objects," while there are legitimate gains in code by having singular table names. For example:
It avoids bugs and mistakes caused by plural ambiguities. Programmers aren't exactly known for their spelling expertise, and pluralizing some words are confusing. For example, does the plural word end in 'es' or just 's'? Is it persons or people? When you work on a project with large teams, this can become an issue. For example, an instance where a team member uses the incorrect method to pluralize a table he creates. By the time I interact with this table, it is used all over in code I don't have access to or would take too long to fix. The result is I have to remember to spell the table wrong every time I use it. Something very similar to this happened to me. The easier you can make it for every member of the team to consistently and easily use the exact, correct table names without errors or having to look up table names all the time, the better. The singular version is much easier to handle in a team environment.
If you use the singular version of a table name AND prefix the primary key with the table name, you now have the advantage of easily determining a table name from a primary key or vice versa via code alone. You can be given a variable with a table name in it, concatenate "Id" to the end, and you now have the primary key of the table via code, without having to do an additional query. Or you can cut off "Id" from the end of a primary key to determine a table name via code. If you use "id" without a table name for the primary key, then you cannot via code determine the table name from the primary key. In addition, most people who pluralize table names and prefix PK columns with the table name use the singular version of the table name in the PK (for example statuses and status_id), making it impossible to do this at all.
If you make table names singular, you can have them match the class names they represent. Once again, this can simplify code and allow you to do really neat things, like instantiating a class by having nothing but the table name. It also just makes your code more consistent, which leads to...
If you make the table name singular, it makes your naming scheme consistent, organized, and easy to maintain in every location. You know that in every instance in your code, whether it's in a column name, as a class name, or as the table name, it's the same exact name. This allows you to do global searches to see everywhere that data is used. When you pluralize a table name, there will be cases where you will use the singular version of that table name (the class it turns into, in the primary key). It just makes sense to not have some instances where your data is referred to as plural and some instances singular.
To sum it up, if you pluralize your table names you are losing all sorts of advantages in making your code smarter and easier to handle. There may even be cases where you have to have lookup tables/arrays to convert your table names to object or local code names you could have avoided. Singular table names, though perhaps feeling a little weird at first, offer significant advantages over pluralized names and I believe are best practice.
I'm also in favour of a ISO/IEC 11179 style naming convention, noting they are guidelines rather than being prescriptive.
"Tables are Collections of Entities, and follow Collection naming guidelines. Ideally, a collective name is used: eg., Personnel. Plural is also correct: Employees. Incorrect names include: Employee, tblEmployee, and EmployeeTable."
As always, there are exceptions to rules e.g. a table which always has exactly one row may be better with a singular name e.g. a config table. And consistency is of utmost importance: check whether you shop has a convention and, if so, follow it; if you don't like it then do a business case to have it changed rather than being the lone ranger.
Naming conventions allow the development team to design discovereability and maintainability at the heart of the project.
A good naming convention takes time to evolve but once it’s in place it allows the team to move forward with a common language. A good naming convention grows organically with the project. A good naming convention easily copes with changes during the longest and most important phase of the software lifecycle - service management in production.
Here are my answers:
Naming is hard but in every organisation there is someone who can name things and in every software team there should be someone who takes responsibility for namings standards and ensures that naming issues like sec_id, sec_value and security_id get resolved early before they get baked into the project.
So what are the basic tenets of a good naming convention and standards: -
Should table names be plural?
Never. The arguments for it being a collection make sense, but you never know what the table is going to contain (0,1 or many items). Plural rules make the naming unnecessarily complicated. 1 House, 2 houses, mouse vs mice, person vs people, and we haven't even looked at any other languages.
Update person set property = 'value' acts on each person in the table.
Select * from person where person.name = 'Greg' returns a collection/rowset of person rows.
Should column names be singular?
Usually, yes, except where you are breaking normalisation rules.
Should I prefix tables or columns?
Mostly a platform preference. We prefer to prefix columns with the table name. We don't prefix tables, but we do prefix views (v_) and stored_procedures (sp_ or f_ (function)). That helps people who want to try to upday v_person.age which is actually a calculated field in a view (which can't be UPDATEd anyway).
It is also a great way to avoid keyword collision (delivery.from breaks, but delivery_from does not).
It does make the code more verbose, but often aids in readability.
bob = new person()
bob.person_name = 'Bob'
bob.person_dob = '1958-12-21'
... is very readable and explicit. This can get out of hand though:
indicates a relationship between customer and the customer_type table, indicates the primary key on the customer_type table (customer_type_id) and if you ever see 'customer_customer_type_id' whilst debugging a query, you know instantly where it is from (customer table).
or where you have a M-M relationship between customer_type and customer_category (only certain types are available to certain categories)
... is a little (!) on the long side.
Should I use any case in naming items? Yes - lower case :), with underscores. These are very readable and cross platform. Together with 3 above it also makes sense.
Most of these are preferences though. - As long as you are consistent, it should be predictable for anyone that has to read it.
Table names should always be singular, because they represent a set of objects. As you say herd to designate a group of sheep, or flock do designate a group of birds. No need for plural. When a table name is composition of two names and naming convention is in plural it becomes hard to know if the plural name should be the first word or second word or both. It’s the logic – Object.instance, not objects.instance. Or TableName.column, not TableNames.column(s). Microsoft SQL is not case sensitive, it’s easier to read table names, if upper case letters are used, to separate table or column names when they are composed of two or more names.
--Example SQL CREATE TABLE D001_Students ( StudentID INTEGER CONSTRAINT nnD001_STID NOT NULL, ChristianName NVARCHAR(255) CONSTRAINT nnD001_CHNA NOT NULL, Surname NVARCHAR(255) CONSTRAINT nnD001_SURN NOT NULL, CONSTRAINT pkD001 PRIMARY KEY(StudentID) ); CREATE INDEX idxD001_STID on D001_Students; CREATE TABLE D002_Classes ( ClassID INTEGER CONSTRAINT nnD002_CLID NOT NULL, StudentID INTEGER CONSTRAINT nnD002_STID NOT NULL, ClassName NVARCHAR(255) CONSTRAINT nnD002_CLNA NOT NULL, CONSTRAINT pkD001 PRIMARY KEY(ClassID, StudentID), CONSTRAINT fkD001_STID FOREIGN KEY(StudentID) REFERENCES D001_Students(StudentID) ); CREATE INDEX idxD002_CLID on D002_Classes; CREATE VIEW V001_StudentClasses ( SELECT D001.ChristianName, D001.Surname, D002.ClassName FROM D001_Students D001 INNER JOIN D002_Classes D002 ON D001.StudentID = D002.StudentID );
These are the conventions I was taught, but you should adapt to whatever you developement hose uses.
In my opinion:
However, like it has been mentioned, any convention is better than no convention. No matter how you choose to do it, document it so that future modifications follow the same conventions.
My opinions on these are:
1) No, table names should be singular.
While it appears to make sense for the simple selection (
select * from Orders) it makes less sense for the OO equivalent (
Orders x = new Orders).
A table in a DB is really the set of that entity, it makes more sense once you're using set-logic:
select Orders.* from Orders inner join Products on Orders.Key = Products.Key
That last line, the actual logic of the join, looks confusing with plural table names.
I'm not sure about always using an alias (as Matt suggests) clears that up.
2) They should be singular as they only hold 1 property
3) Never, if the column name is ambiguous (as above where they both have a column called [Key]) the name of the table (or its alias) can distinguish them well enough. You want queries to be quick to type and simple - prefixes add unnecessary complexity.
4) Whatever you want, I'd suggest CapitalCase
I don't think there's one set of absolute guidelines on any of these.
As long as whatever you pick is consistent across the application or DB I don't think it really matters.