[sql] A beginner's guide to SQL database design

Do you know a good source to learn how to design SQL solutions?

Beyond the basic language syntax, I'm looking for something to help me understand:

  1. What tables to build and how to link them
  2. How to design for different scales (small client APP to a huge distributed website)
  3. How to write effective / efficient / elegant SQL queries

This question is related to sql database database-design scalability

The answer is


Experience counts for a lot, but in terms of table design you can learn a lot from how ORMs like Hibernate and Grails operate to see why they do things. In addition:

  1. Keep different types of data separate - don't store addresses in your order table, link to an address in a separate addresses table, for example.

  2. I personally like having an integer or long surrogate key on each table (that holds data, not those that link different tables together, e,g., m:n relationships) that is the primary key.

  3. I also like having a created and modified timestamp column.

  4. Ensure that every column that you do "where column = val" in any query has an index. Maybe not the most perfect index in the world for the data type, but at least an index.

  5. Set up your foreign keys. Also set up ON DELETE and ON MODIFY rules where relevant, to either cascade or set null, depending on your object structure (so you only need to delete once at the 'head' of your object tree, and all that object's sub-objects get removed automatically).

  6. If you want to modularise your code, you might want to modularise your DB schema - e.g., this is the "customers" area, this is the "orders" area, and this is the "products" area, and use join/link tables between them, even if they're 1:n relations, and maybe duplicate the important information (i.e., duplicate the product name, code, price into your order_details table). Read up on normalisation.

  7. Someone else will recommend exactly the opposite for some or all of the above :p - never one true way to do some things eh!


These are questions which, in my opionion, requires different knowledge from different domains.

  1. You just can't know in advance "which" tables to build, you have to know the problem you have to solve and design the schema accordingly;
  2. This is a mix of database design decision and your database vendor custom capabilities (ie. you should check the documentation of your (r)dbms and eventually learn some "tips & tricks" for scaling), also the configuration of your dbms is crucial for scaling (replication, data partitioning and so on);
  3. again, almost every rdbms comes with a particular "dialect" of the SQL language, so if you want efficient queries you have to learn that particular dialect --btw. much probably write elegant query which are also efficient is a big deal: elegance and efficiency are frequently conflicting goals--

That said, maybe you want to read some books, personally I've used this book in my datbase university course (and found a decent one, but I've not read other books in this field, so my advice is to check out for some good books in database design).


These are questions which, in my opionion, requires different knowledge from different domains.

  1. You just can't know in advance "which" tables to build, you have to know the problem you have to solve and design the schema accordingly;
  2. This is a mix of database design decision and your database vendor custom capabilities (ie. you should check the documentation of your (r)dbms and eventually learn some "tips & tricks" for scaling), also the configuration of your dbms is crucial for scaling (replication, data partitioning and so on);
  3. again, almost every rdbms comes with a particular "dialect" of the SQL language, so if you want efficient queries you have to learn that particular dialect --btw. much probably write elegant query which are also efficient is a big deal: elegance and efficiency are frequently conflicting goals--

That said, maybe you want to read some books, personally I've used this book in my datbase university course (and found a decent one, but I've not read other books in this field, so my advice is to check out for some good books in database design).



It's been a while since I read it (so, I'm not sure how much of it is still relevant), but my recollection is that Joe Celko's SQL for Smarties book provides a lot of info on writing elegant, effective, and efficient queries.


Experience counts for a lot, but in terms of table design you can learn a lot from how ORMs like Hibernate and Grails operate to see why they do things. In addition:

  1. Keep different types of data separate - don't store addresses in your order table, link to an address in a separate addresses table, for example.

  2. I personally like having an integer or long surrogate key on each table (that holds data, not those that link different tables together, e,g., m:n relationships) that is the primary key.

  3. I also like having a created and modified timestamp column.

  4. Ensure that every column that you do "where column = val" in any query has an index. Maybe not the most perfect index in the world for the data type, but at least an index.

  5. Set up your foreign keys. Also set up ON DELETE and ON MODIFY rules where relevant, to either cascade or set null, depending on your object structure (so you only need to delete once at the 'head' of your object tree, and all that object's sub-objects get removed automatically).

  6. If you want to modularise your code, you might want to modularise your DB schema - e.g., this is the "customers" area, this is the "orders" area, and this is the "products" area, and use join/link tables between them, even if they're 1:n relations, and maybe duplicate the important information (i.e., duplicate the product name, code, price into your order_details table). Read up on normalisation.

  7. Someone else will recommend exactly the opposite for some or all of the above :p - never one true way to do some things eh!


I started out with this article

http://en.tekstenuitleg.net/articles/software/database-design-tutorial/intro.html

It's pretty concise compared to reading an entire book and it explains the basics of database design (normalization, types of relationships) very well.


It's been a while since I read it (so, I'm not sure how much of it is still relevant), but my recollection is that Joe Celko's SQL for Smarties book provides a lot of info on writing elegant, effective, and efficient queries.


Head First SQL is a great introduction.



Head First SQL is a great introduction.


These are questions which, in my opionion, requires different knowledge from different domains.

  1. You just can't know in advance "which" tables to build, you have to know the problem you have to solve and design the schema accordingly;
  2. This is a mix of database design decision and your database vendor custom capabilities (ie. you should check the documentation of your (r)dbms and eventually learn some "tips & tricks" for scaling), also the configuration of your dbms is crucial for scaling (replication, data partitioning and so on);
  3. again, almost every rdbms comes with a particular "dialect" of the SQL language, so if you want efficient queries you have to learn that particular dialect --btw. much probably write elegant query which are also efficient is a big deal: elegance and efficiency are frequently conflicting goals--

That said, maybe you want to read some books, personally I've used this book in my datbase university course (and found a decent one, but I've not read other books in this field, so my advice is to check out for some good books in database design).


Head First SQL is a great introduction.


It's been a while since I read it (so, I'm not sure how much of it is still relevant), but my recollection is that Joe Celko's SQL for Smarties book provides a lot of info on writing elegant, effective, and efficient queries.


Experience counts for a lot, but in terms of table design you can learn a lot from how ORMs like Hibernate and Grails operate to see why they do things. In addition:

  1. Keep different types of data separate - don't store addresses in your order table, link to an address in a separate addresses table, for example.

  2. I personally like having an integer or long surrogate key on each table (that holds data, not those that link different tables together, e,g., m:n relationships) that is the primary key.

  3. I also like having a created and modified timestamp column.

  4. Ensure that every column that you do "where column = val" in any query has an index. Maybe not the most perfect index in the world for the data type, but at least an index.

  5. Set up your foreign keys. Also set up ON DELETE and ON MODIFY rules where relevant, to either cascade or set null, depending on your object structure (so you only need to delete once at the 'head' of your object tree, and all that object's sub-objects get removed automatically).

  6. If you want to modularise your code, you might want to modularise your DB schema - e.g., this is the "customers" area, this is the "orders" area, and this is the "products" area, and use join/link tables between them, even if they're 1:n relations, and maybe duplicate the important information (i.e., duplicate the product name, code, price into your order_details table). Read up on normalisation.

  7. Someone else will recommend exactly the opposite for some or all of the above :p - never one true way to do some things eh!


Head First SQL is a great introduction.


I started out with this article

http://en.tekstenuitleg.net/articles/software/database-design-tutorial/intro.html

It's pretty concise compared to reading an entire book and it explains the basics of database design (normalization, types of relationships) very well.


These are questions which, in my opionion, requires different knowledge from different domains.

  1. You just can't know in advance "which" tables to build, you have to know the problem you have to solve and design the schema accordingly;
  2. This is a mix of database design decision and your database vendor custom capabilities (ie. you should check the documentation of your (r)dbms and eventually learn some "tips & tricks" for scaling), also the configuration of your dbms is crucial for scaling (replication, data partitioning and so on);
  3. again, almost every rdbms comes with a particular "dialect" of the SQL language, so if you want efficient queries you have to learn that particular dialect --btw. much probably write elegant query which are also efficient is a big deal: elegance and efficiency are frequently conflicting goals--

That said, maybe you want to read some books, personally I've used this book in my datbase university course (and found a decent one, but I've not read other books in this field, so my advice is to check out for some good books in database design).


Experience counts for a lot, but in terms of table design you can learn a lot from how ORMs like Hibernate and Grails operate to see why they do things. In addition:

  1. Keep different types of data separate - don't store addresses in your order table, link to an address in a separate addresses table, for example.

  2. I personally like having an integer or long surrogate key on each table (that holds data, not those that link different tables together, e,g., m:n relationships) that is the primary key.

  3. I also like having a created and modified timestamp column.

  4. Ensure that every column that you do "where column = val" in any query has an index. Maybe not the most perfect index in the world for the data type, but at least an index.

  5. Set up your foreign keys. Also set up ON DELETE and ON MODIFY rules where relevant, to either cascade or set null, depending on your object structure (so you only need to delete once at the 'head' of your object tree, and all that object's sub-objects get removed automatically).

  6. If you want to modularise your code, you might want to modularise your DB schema - e.g., this is the "customers" area, this is the "orders" area, and this is the "products" area, and use join/link tables between them, even if they're 1:n relations, and maybe duplicate the important information (i.e., duplicate the product name, code, price into your order_details table). Read up on normalisation.

  7. Someone else will recommend exactly the opposite for some or all of the above :p - never one true way to do some things eh!


Examples related to sql

Passing multiple values for same variable in stored procedure SQL permissions for roles Generic XSLT Search and Replace template Access And/Or exclusions Pyspark: Filter dataframe based on multiple conditions Subtracting 1 day from a timestamp date PYODBC--Data source name not found and no default driver specified select rows in sql with latest date for each ID repeated multiple times ALTER TABLE DROP COLUMN failed because one or more objects access this column Create Local SQL Server database

Examples related to database

Implement specialization in ER diagram phpMyAdmin - Error > Incorrect format parameter? Authentication plugin 'caching_sha2_password' cannot be loaded Room - Schema export directory is not provided to the annotation processor so we cannot export the schema SQL Query Where Date = Today Minus 7 Days MySQL Error: : 'Access denied for user 'root'@'localhost' SQL Server date format yyyymmdd How to create a foreign key in phpmyadmin WooCommerce: Finding the products in database TypeError: tuple indices must be integers, not str

Examples related to database-design

What are OLTP and OLAP. What is the difference between them? How to create a new schema/new user in Oracle Database 11g? What are the lengths of Location Coordinates, latitude and longitude? cannot connect to pc-name\SQLEXPRESS SQL ON DELETE CASCADE, Which Way Does the Deletion Occur? What are the best practices for using a GUID as a primary key, specifically regarding performance? "Prevent saving changes that require the table to be re-created" negative effects Difference between scaling horizontally and vertically for databases Using SQL LOADER in Oracle to import CSV file What is cardinality in Databases?

Examples related to scalability

Difference between scaling horizontally and vertically for databases Best way to store chat messages in a database? Does Django scale? A beginner's guide to SQL database design What is considered a good response time for a dynamic, personalized web application?