[sql] Storing sex (gender) in database

I want to store a user's gender in a database with as little (size/performance) cost as possible.

So far, 3 scenarios come to mind

  1. Int - aligned with Enum in code (1 = Male, 2 = Female, 3 = ...)
  2. char(1) - Store m, f or another single character identifier
  3. Bit (boolean) - is there an appropriate field name for this option?

The reason I ask is because of this answer which mentions that chars are smaller than booleans.

I should clarify that I'm using MS SQL 2008, which DOES in fact have the bit datatype.

This question is related to sql database-design

The answer is


I would go with Option 3 but multiple NON NULLABLE bit columns instead of one. IsMale (1=Yes / 0=No) IsFemale (1=Yes / 0=No)

if requried: IsUnknownGender (1=Yes / 0=No) and so on...

This makes for easy reading of the definitions, easy extensibility, easy programmability, no possibility of using values outside the domain and no requirement of a second lookup table+FK or CHECK constraints to lock down the values.

EDIT: Correction, you do need at least one constraint to ensure the set flags are valid.


I use char 'f', 'm' and 'u' because I surmise the gender from name, voice and conversation, and sometimes don't know the gender. The final determination is their opinion.

It really depends how well you know the person and whether your criteria is physical form or personal identity. A psychologist might need additional options - cross to female, cross to male, trans to female, trans to male, hermaphrodite and undecided. With 9 options, not clearly defined by a single character, I might go with Hugo's advice of tiny integer.


In medicine there are four genders: male, female, indeterminate, and unknown. You mightn't need all four but you certainly need 1, 2, and 4. It's not appropriate to have a default value for this datatype. Even less to treat it as a Boolean with 'is' and 'isn't' states.


An Int (or TinyInt) aligned to an Enum field would be my methodology.

First, if you have a single bit field in a database, the row will still use a full byte, so as far as space savings, it only pays off if you have multiple bit fields.

Second, strings/chars have a "magic value" feel to them, regardless of how obvious they may seem at design time. Not to mention, it lets people store just about any value they would not necessarily map to anything obvious.

Third, a numeric value is much easier (and better practice) to create a lookup table for, in order to enforce referential integrity, and can correlate 1-to-1 with an enum, so there is parity in storing the value in memory within the application or in the database.


Option 3 is your best bet, but not all DB engines have a "bit" type. If you don't have a bit, then TinyINT would be your best bet.


CREATE TABLE Admission (
    Rno INT PRIMARY KEY AUTO_INCREMENT,
    Name VARCHAR(25) NOT NULL,
    Gender ENUM('M','F'),
    Boolean_Valu boolean,
    Dob Date,
    Fees numeric(7,2) NOT NULL
);




insert into Admission (Name,Gender,Boolean_Valu,Dob,Fees)values('Raj','M',true,'1990-07-12',50000);
insert into Admission (Name,Gender,Boolean_Valu,Dob,Fees)values('Rani','F',false,'1994-05-10',15000);
select * from admission;

enter link description here


There is already an ISO standard for this; no need to invent your own scheme:

http://en.wikipedia.org/wiki/ISO_5218

Per the standard, the column should be called "Sex" and the 'closest' data type would be tinyint with a CHECK constraint or lookup table as appropriate.