I am not able to understand what Range / primary key is here in the docs on Working with Tables and Data in DynamoDB
How does it work?
What do they mean by "unordered hash index on the hash attribute and a sorted range index on the range attribute"?
This question is related to
hash
amazon-dynamodb
primary-key
database
nosql
A well-explained answer is already given by @mkobit, but I will add a big picture of the range key and hash key.
In a simple words range + hash key = composite primary key
CoreComponents of Dynamodb
A primary key is consists of a hash key and an optional range key. Hash key is used to select the DynamoDB partition. Partitions are parts of the table data. Range keys are used to sort the items in the partition, if they exist.
So both have a different purpose and together help to do complex query.
In the above example hashkey1 can have multiple n-range.
Another example of range and hashkey is game, userA(hashkey)
can play Ngame(range)
The Music table described in Tables, Items, and Attributes is an example of a table with a composite primary key (Artist and SongTitle). You can access any item in the Music table directly, if you provide the Artist and SongTitle values for that item.
A composite primary key gives you additional flexibility when querying data. For example, if you provide only the value for Artist, DynamoDB retrieves all of the songs by that artist. To retrieve only a subset of songs by a particular artist, you can provide a value for Artist along with a range of values for SongTitle.
https://www.slideshare.net/InfoQ/amazon-dynamodb-design-patterns-best-practices https://www.slideshare.net/AmazonWebServices/awsome-day-2016-module-4-databases-amazon-dynamodb-and-amazon-rds https://ceyhunozgun.blogspot.com/2017/04/implementing-object-persistence-with-dynamodb.html
@vnr you can retrieve all the sort keys associated with a partition key by just using the query using partion key. No need of scan. The point here is partition key is compulsory in a query . Sort key are used only to get range of data
As the whole thing is mixing up let's look at it function and code to simulate what it means consicely
The only way to get a row is via primary key
getRow(pk: PrimaryKey): Row
Primary key data structure can be this:
// If you decide your primary key is just the partition key.
class PrimaryKey(partitionKey: String)
// and in thids case
getRow(somePartitionKey): Row
However you can decide your primary key is partition key + sort key in this case:
// if you decide your primary key is partition key + sort key
class PrimaryKey(partitionKey: String, sortKey: String)
getRow(partitionKey, sortKey): Row
getMultipleRows(partitionKey): Row[]
So the bottom line:
Decided that your primary key is partition key only? get single row by partition key.
Decided that your primary key is partition key + sort key? 2.1 Get single row by (partition key, sort key) or get range of rows by (partition key)
In either way you get a single row by primary key the only question is if you defined that primary key to be partition key only or partition key + sort key
Building blocks are:
Think of Item as a row and of KV Attribute as cells in that row.
You can do (2) only if you decided that your PK is composed of (HashKey, SortKey).
More visually as its complex, the way I see it:
+----------------------------------------------------------------------------------+
|Table |
|+------------------------------------------------------------------------------+ |
||Item | |
||+-----------+ +-----------+ +-----------+ +-----------+ | |
|||primaryKey | |kv attr | |kv attr ...| |kv attr ...| | |
||+-----------+ +-----------+ +-----------+ +-----------+ | |
|+------------------------------------------------------------------------------+ |
|+------------------------------------------------------------------------------+ |
||Item | |
||+-----------+ +-----------+ +-----------+ +-----------+ +-----------+ | |
|||primaryKey | |kv attr | |kv attr ...| |kv attr ...| |kv attr ...| | |
||+-----------+ +-----------+ +-----------+ +-----------+ +-----------+ | |
|+------------------------------------------------------------------------------+ |
| |
+----------------------------------------------------------------------------------+
+----------------------------------------------------------------------------------+
|1. Always get item by PrimaryKey |
|2. PK is (Hash,RangeKey), great get MULTIPLE Items by Hash, filter/sort by range |
|3. PK is HashKey: just get a SINGLE ITEM by hashKey |
| +--------------------------+|
| +---------------+ |getByPK => getBy(1 ||
| +-----------+ +>|(HashKey,Range)|--->|hashKey, > < or startWith ||
| +->|Composite |-+ +---------------+ |of rangeKeys) ||
| | +-----------+ +--------------------------+|
|+-----------+ | |
||PrimaryKey |-+ |
|+-----------+ | +--------------------------+|
| | +-----------+ +---------------+ |getByPK => get by specific||
| +->|HashType |-->|get one item |--->|hashKey ||
| +-----------+ +---------------+ | ||
| +--------------------------+|
+----------------------------------------------------------------------------------+
So what is happening above. Notice the following observations. As we said our data belongs to (Table, Item, KVAttribute). Then Every Item has a primary key. Now the way you compose that primary key is meaningful into how you can access the data.
If you decide that your PrimaryKey is simply a hash key then great you can get a single item out of it. If you decide however that your primary key is hashKey + SortKey then you could also do a range query on your primary key because you will get your items by (HashKey + SomeRangeFunction(on range key)). So you can get multiple items with your primary key query.
Note: I did not refer to secondary indexes.
Source: Stackoverflow.com