I started by googling, and found this article which talks about mutex tables.
I have a table with ~14 million records. If I want to add more data in the same format, is there a way to ensure the record I want to insert does not already exist without using a pair of queries (ie, one query to check and one to insert is the result set is empty)?
Does a unique
constraint on a field guarantee the insert
will fail if it's already there?
It seems that with merely a constraint, when I issue the insert via php, the script croaks.
This question is related to
php
mysql
sql
primary-key
sql-insert
Solution:
INSERT INTO `table` (`value1`, `value2`)
SELECT 'stuff for value1', 'stuff for value2' FROM DUAL
WHERE NOT EXISTS (SELECT * FROM `table`
WHERE `value1`='stuff for value1' AND `value2`='stuff for value2' LIMIT 1)
Explanation:
The innermost query
SELECT * FROM `table`
WHERE `value1`='stuff for value1' AND `value2`='stuff for value2' LIMIT 1
used as the WHERE NOT EXISTS
-condition detects if there already exists a row with the data to be inserted. After one row of this kind is found, the query may stop, hence the LIMIT 1
(micro-optimization, may be omitted).
The intermediate query
SELECT 'stuff for value1', 'stuff for value2' FROM DUAL
represents the values to be inserted. DUAL
refers to a special one row, one column table present by default in all Oracle databases (see https://en.wikipedia.org/wiki/DUAL_table). On a MySQL-Server version 5.7.26 I got a valid query when omitting FROM DUAL
, but older versions (like 5.5.60) seem to require the FROM
information. By using WHERE NOT EXISTS
the intermediate query returns an empty result set if the innermost query found matching data.
The outer query
INSERT INTO `table` (`value1`, `value2`)
inserts the data, if any is returned by the intermediate query.
REPLACE INTO `transcripts`
SET `ensembl_transcript_id` = 'ENSORGT00000000001',
`transcript_chrom_start` = 12345,
`transcript_chrom_end` = 12678;
If the record exists, it will be overwritten; if it does not yet exist, it will be created.
Any simple constraint should do the job, if an exception is acceptable. Examples :
Sorry is this seems deceptively simple. I know it looks bad confronted to the link you share with us. ;-(
But I neverleless give this answer, because it seem to fill your need. (If not, it may trigger your updating your requirements, which would be "a Good Thing"(TM) also).
Edited: If an insert would break the database unique constraint, an exception is throw at the database level, relayed by the driver. It will certainly stop your script, with a failure. It must be possible in PHP to adress that case ...
on duplicate key update, or insert ignore can be viable solutions with MySQL.
Example of on duplicate key update update based on mysql.com
INSERT INTO table (a,b,c) VALUES (1,2,3)
ON DUPLICATE KEY UPDATE c=c+1;
UPDATE table SET c=c+1 WHERE a=1;
Example of insert ignore based on mysql.com
INSERT [LOW_PRIORITY | DELAYED | HIGH_PRIORITY] [IGNORE]
[INTO] tbl_name [(col_name,...)]
{VALUES | VALUE} ({expr | DEFAULT},...),(...),...
[ ON DUPLICATE KEY UPDATE
col_name=expr
[, col_name=expr] ... ]
Or:
INSERT [LOW_PRIORITY | DELAYED | HIGH_PRIORITY] [IGNORE]
[INTO] tbl_name
SET col_name={expr | DEFAULT}, ...
[ ON DUPLICATE KEY UPDATE
col_name=expr
[, col_name=expr] ... ]
Or:
INSERT [LOW_PRIORITY | HIGH_PRIORITY] [IGNORE]
[INTO] tbl_name [(col_name,...)]
SELECT ...
[ ON DUPLICATE KEY UPDATE
col_name=expr
[, col_name=expr] ... ]
Try the following:
IF (SELECT COUNT(*) FROM beta WHERE name = 'John' > 0)
UPDATE alfa SET c1=(SELECT id FROM beta WHERE name = 'John')
ELSE
BEGIN
INSERT INTO beta (name) VALUES ('John')
INSERT INTO alfa (c1) VALUES (LAST_INSERT_ID())
END
If you already have a unique or primary key, the other answers with either INSERT INTO ... ON DUPLICATE KEY UPDATE ...
or REPLACE INTO ...
should work fine (note that replace into deletes if exists and then inserts - thus does not partially update existing values).
But if you have the values for some_column_id
and some_type
, the combination of which are known to be unique. And you want to update some_value
if exists, or insert if not exists. And you want to do it in just one query (to avoid using a transaction). This might be a solution:
INSERT INTO my_table (id, some_column_id, some_type, some_value)
SELECT t.id, t.some_column_id, t.some_type, t.some_value
FROM (
SELECT id, some_column_id, some_type, some_value
FROM my_table
WHERE some_column_id = ? AND some_type = ?
UNION ALL
SELECT s.id, s.some_column_id, s.some_type, s.some_value
FROM (SELECT NULL AS id, ? AS some_column_id, ? AS some_type, ? AS some_value) AS s
) AS t
LIMIT 1
ON DUPLICATE KEY UPDATE
some_value = ?
Basically, the query executes this way (less complicated than it may look):
WHERE
clause match.s
), where the column values are explicitly given (s.id is NULL, so it will generate a new auto-increment identifier).s
is discarded (due to LIMIT 1 on table t
), and it will always trigger an ON DUPLICATE KEY
which will UPDATE
the some_value
column.s
).Note: Every table in a relational database should have at least a primary auto-increment id
column. If you don't have this, add it, even when you don't need it at first sight. It is definitely needed for this "trick".
There are several answers that cover how to solve this if you have a UNIQUE
index that you can check against with ON DUPLICATE KEY
or INSERT IGNORE
. That is not always the case, and as UNIQUE
has a length constraint (1000 bytes) you might not be able to change that. For example, I had to work with metadata in WordPress (wp_postmeta
).
I finally solved it with two queries:
UPDATE wp_postmeta SET meta_value = ? WHERE meta_key = ? AND post_id = ?;
INSERT INTO wp_postmeta (post_id, meta_key, meta_value) SELECT DISTINCT ?, ?, ? FROM wp_postmeta WHERE NOT EXISTS(SELECT * FROM wp_postmeta WHERE meta_key = ? AND post_id = ?);
Query 1 is a regular UPDATE
query with no effect when the dataset in question is not there. Query 2 is an INSERT
which depends on a NOT EXISTS
, i.e. the INSERT
is only executed when the dataset doesn't exist.
Here is a PHP function that will insert a row only if all the specified columns values don't already exist in the table.
If one of the columns differ, the row will be added.
If the table is empty, the row will be added.
If a row exists where all the specified columns have the specified values, the row won't be added.
function insert_unique($table, $vars)
{
if (count($vars)) {
$table = mysql_real_escape_string($table);
$vars = array_map('mysql_real_escape_string', $vars);
$req = "INSERT INTO `$table` (`". join('`, `', array_keys($vars)) ."`) ";
$req .= "SELECT '". join("', '", $vars) ."' FROM DUAL ";
$req .= "WHERE NOT EXISTS (SELECT 1 FROM `$table` WHERE ";
foreach ($vars AS $col => $val)
$req .= "`$col`='$val' AND ";
$req = substr($req, 0, -5) . ") LIMIT 1";
$res = mysql_query($req) OR die();
return mysql_insert_id();
}
return False;
}
Example usage :
<?php
insert_unique('mytable', array(
'mycolumn1' => 'myvalue1',
'mycolumn2' => 'myvalue2',
'mycolumn3' => 'myvalue3'
)
);
?>
Something worth noting is that INSERT IGNORE will still increment the primary key whether the statement was a success or not just like a normal INSERT would.
This will cause gaps in your primary keys that might make a programmer mentally unstable. Or if your application is poorly designed and depends on perfect incremental primary keys, it might become a headache.
Look into innodb_autoinc_lock_mode = 0
(server setting, and comes with a slight performance hit), or use a SELECT first to make sure your query will not fail (which also comes with a performance hit and extra code).
Source: Stackoverflow.com