[database] What is the most efficient way to store tags in a database?

I am implementing a tagging system on my website similar to one stackoverflow uses, my question is - what is the most effective way to store tags so that they may be searched and filtered?

My idea is this:

Table: Items
Columns: Item_ID, Title, Content

Table: Tags
Columns: Title, Item_ID

Is this too slow? Is there a better way?

This question is related to database database-design tags tagging

The answer is


Actually I believe de-normalising the tags table might be a better way forward, depending on scale.

This way, the tags table simply has tagid, itemid, tagname.

You'll get duplicate tagnames, but it makes adding/removing/editing tags for specific items MUCH more simple. You don't have to create a new tag, remove the allocation of the old one and re-allocate a new one, you just edit the tagname.

For displaying a list of tags, you simply use DISTINCT or GROUP BY, and of course you can count how many times a tag is used easily, too.


I'd suggest using intermediary third table for storing tags<=>items associations, since we have many-to-many relations between tags and items, i.e. one item can be associated with multiple tags and one tag can be associated with multiple items. HTH, Valve.


Actually I believe de-normalising the tags table might be a better way forward, depending on scale.

This way, the tags table simply has tagid, itemid, tagname.

You'll get duplicate tagnames, but it makes adding/removing/editing tags for specific items MUCH more simple. You don't have to create a new tag, remove the allocation of the old one and re-allocate a new one, you just edit the tagname.

For displaying a list of tags, you simply use DISTINCT or GROUP BY, and of course you can count how many times a tag is used easily, too.


Items should have an "ID" field, and Tags should have an "ID" field (Primary Key, Clustered).

Then make an intermediate table of ItemID/TagID and put the "Perfect Index" on there.


I'd suggest using intermediary third table for storing tags<=>items associations, since we have many-to-many relations between tags and items, i.e. one item can be associated with multiple tags and one tag can be associated with multiple items. HTH, Valve.


You can't really talk about slowness based on the data you provided in a question. And I don't think you should even worry too much about performance at this stage of developement. It's called premature optimization.

However, I'd suggest that you'd include Tag_ID column in the Tags table. It's usually a good practice that every table has an ID column.


You can't really talk about slowness based on the data you provided in a question. And I don't think you should even worry too much about performance at this stage of developement. It's called premature optimization.

However, I'd suggest that you'd include Tag_ID column in the Tags table. It's usually a good practice that every table has an ID column.


Items should have an "ID" field, and Tags should have an "ID" field (Primary Key, Clustered).

Then make an intermediate table of ItemID/TagID and put the "Perfect Index" on there.


If space is going to be an issue, have a 3rd table Tags(Tag_Id, Title) to store the text for the tag and then change your Tags table to be (Tag_Id, Item_Id). Those two values should provide a unique composite primary key as well.


I'd suggest using intermediary third table for storing tags<=>items associations, since we have many-to-many relations between tags and items, i.e. one item can be associated with multiple tags and one tag can be associated with multiple items. HTH, Valve.


If you don't mind using a bit of non-standard stuff, Postgres version 9.4 and up has an option of storing a record of type JSON text array.

Your schema would be:

Table: Items
Columns: Item_ID:int, Title:text, Content:text

Table: Tags
Columns: Item_ID:int, Tag_Title:text[]

For more info, see this excellent post by Josh Berkus: http://www.databasesoup.com/2015/01/tag-all-things.html

There are more various options compared thoroughly for performance and the one suggested above is the best overall.


Items should have an "ID" field, and Tags should have an "ID" field (Primary Key, Clustered).

Then make an intermediate table of ItemID/TagID and put the "Perfect Index" on there.


I'd suggest using intermediary third table for storing tags<=>items associations, since we have many-to-many relations between tags and items, i.e. one item can be associated with multiple tags and one tag can be associated with multiple items. HTH, Valve.


If space is going to be an issue, have a 3rd table Tags(Tag_Id, Title) to store the text for the tag and then change your Tags table to be (Tag_Id, Item_Id). Those two values should provide a unique composite primary key as well.


Actually I believe de-normalising the tags table might be a better way forward, depending on scale.

This way, the tags table simply has tagid, itemid, tagname.

You'll get duplicate tagnames, but it makes adding/removing/editing tags for specific items MUCH more simple. You don't have to create a new tag, remove the allocation of the old one and re-allocate a new one, you just edit the tagname.

For displaying a list of tags, you simply use DISTINCT or GROUP BY, and of course you can count how many times a tag is used easily, too.


Items should have an "ID" field, and Tags should have an "ID" field (Primary Key, Clustered).

Then make an intermediate table of ItemID/TagID and put the "Perfect Index" on there.


If space is going to be an issue, have a 3rd table Tags(Tag_Id, Title) to store the text for the tag and then change your Tags table to be (Tag_Id, Item_Id). Those two values should provide a unique composite primary key as well.


If you don't mind using a bit of non-standard stuff, Postgres version 9.4 and up has an option of storing a record of type JSON text array.

Your schema would be:

Table: Items
Columns: Item_ID:int, Title:text, Content:text

Table: Tags
Columns: Item_ID:int, Tag_Title:text[]

For more info, see this excellent post by Josh Berkus: http://www.databasesoup.com/2015/01/tag-all-things.html

There are more various options compared thoroughly for performance and the one suggested above is the best overall.


Examples related to database

Implement specialization in ER diagram phpMyAdmin - Error > Incorrect format parameter? Authentication plugin 'caching_sha2_password' cannot be loaded Room - Schema export directory is not provided to the annotation processor so we cannot export the schema SQL Query Where Date = Today Minus 7 Days MySQL Error: : 'Access denied for user 'root'@'localhost' SQL Server date format yyyymmdd How to create a foreign key in phpmyadmin WooCommerce: Finding the products in database TypeError: tuple indices must be integers, not str

Examples related to database-design

What are OLTP and OLAP. What is the difference between them? How to create a new schema/new user in Oracle Database 11g? What are the lengths of Location Coordinates, latitude and longitude? cannot connect to pc-name\SQLEXPRESS SQL ON DELETE CASCADE, Which Way Does the Deletion Occur? What are the best practices for using a GUID as a primary key, specifically regarding performance? "Prevent saving changes that require the table to be re-created" negative effects Difference between scaling horizontally and vertically for databases Using SQL LOADER in Oracle to import CSV file What is cardinality in Databases?

Examples related to tags

How to set image name in Dockerfile? What is initial scale, user-scalable, minimum-scale, maximum-scale attribute in meta tag? How to create named and latest tag in Docker? Excel data validation with suggestions/autocomplete How do you revert to a specific tag in Git? HTML tag <a> want to add both href and onclick working Instagram API to fetch pictures with specific hashtags How to apply Hovering on html area tag? How to get current formatted date dd/mm/yyyy in Javascript and append it to an input How to leave space in HTML

Examples related to tagging

How do you revert to a specific tag in Git? What is the most efficient way to store tags in a database? Recommended SQL database design for tags or tagging