Docker and Redis - a curious beginner's walkthrough

#Redis

The goal of this tutorial is to help a newcomer quickly set up a Redis Docker container and get familiar with basic Redis operations on scalar values and hashes, gaining enough knowledge to build a simple object database.

1. Docker setup

1.1. Create a Dockerfile with the following content:

FROM redis

1.2. Build the image

% docker build -t redis-demo .

Hint: You can use the --help flag to check what each option means. Example:

% docker build --help

  -t, --tag list    Name and optionally a tag in the 'name:tag' format

1.3. Run the image

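Once Docker builds the redis-demo image, you can run a container from that image. The container runs in the foreground, so run the client commands that follow in a second terminal.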

% docker run redis-demo

2. Playing with Docker and Redis in interactive mode

2.1. Use redis-cli to test the connection

% redis-cli
Could not connect to Redis at 127.0.0.1:6379: Connection refused
not connected> exit

The connection is refused because Redis is running in a container whose port isn’t published to the host.
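
The same "connection refused" condition can be checked without redis-cli. The sketch below is a minimal TCP probe using only the Python standard library; the helper name port_is_open is made up for illustration. It only tells whether something accepts connections on the port, not whether that something is Redis.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port is accepted."""
    try:
        # create_connection raises OSError (for example ConnectionRefusedError)
        # when nothing is listening on the port.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # 6379 is the default Redis port used throughout this walkthrough.
    print(port_is_open("127.0.0.1", 6379))
```

With the container started as in step 1.3, this is expected to print False, for the same reason redis-cli cannot connect.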

2.2. Port Mapping

It’s necessary to use the flag -p to publish a container’s port to the host.

% docker run --help | grep "publish"
  -p, --publish list                   Publish a container's port(s) to the host

The -p argument takes the format host_port:container_port. Example:

% docker run -p 6379:6379 redis-demo
% redis-cli
127.0.0.1:6379> ping
PONG
127.0.0.1:6379> exit

2.3. Publishing the container’s port to a different host port

% docker run -p 9736:6379 redis-demo
% redis-cli ping
Could not connect to Redis at 127.0.0.1:6379: Connection refused

redis-cli can’t connect because nothing on the host is listening on the default port 6379. We can check --help to figure out how to connect the client to the published port:

% redis-cli --help
redis-cli 6.2.6

Usage: redis-cli [OPTIONS] [cmd [arg [arg ...]]]
  -h <hostname>      Server hostname (default: 127.0.0.1).
  -p <port>          Server port (default: 6379).

% redis-cli -p 9736
127.0.0.1:9736> ping
PONG
127.0.0.1:9736> exit

2.4. Publishing an invalid container port

% docker run -p 6379:9736 redis-demo
% redis-cli
127.0.0.1:6379> ping
Error: Server closed the connection
127.0.0.1:6379> exit

In this case, the host’s port 6379 is bound, so redis-cli is able to start and send the ping command. However, no service is listening on the specified container port, so redis-cli reports that the server closed the connection.

2.5. Pinging a non-Redis port

% redis-cli -p 4000
127.0.0.1:4000> ping
Error: Protocol error, got "H" as reply type byte

If redis-cli connects to a non-Redis port (for example, a web server), it will report protocol errors.
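
To see where the "reply type byte" in that error comes from, here is a minimal sketch of the Redis wire format (RESP2) in Python. A client sends each command as an array of bulk strings, and every reply starts with one byte that names its type. The "H" above is plausibly the first byte of an HTTP response, although the walkthrough does not say what was listening on port 4000. The function names encode_command and reply_type are made up for this illustration and are not part of any library.

```python
# First byte of every RESP2 reply identifies its type.
REPLY_TYPES = {
    b"+": "simple string",
    b"-": "error",
    b":": "integer",
    b"$": "bulk string",
    b"*": "array",
}


def encode_command(*parts: str) -> bytes:
    """Encode a command as a RESP array of bulk strings."""
    out = [b"*%d\r\n" % len(parts)]
    for part in parts:
        data = part.encode("utf-8")
        out.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(out)


def reply_type(reply: bytes) -> str:
    """Classify a raw reply by its first byte, or raise on anything else."""
    first = reply[:1]
    if first not in REPLY_TYPES:
        shown = first.decode("latin-1")
        raise ValueError(f"Protocol error, got {shown!r} as reply type byte")
    return REPLY_TYPES[first]


print(encode_command("PING"))     # b'*1\r\n$4\r\nPING\r\n'
print(reply_type(b"+PONG\r\n"))   # simple string
```

Calling reply_type on bytes that start with "HTTP/1.1" raises a ValueError, which mirrors the situation redis-cli reports.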

3. Getting familiar with Redis

3.1. Basic Redis operations and concepts

One interesting feature of redis-cli is its autosuggestions. In the example below, it shows the arguments available for the command SET (EX, PX, EXAT, PXAT, KEEPTTL, NX, XX, and GET).

% redis-cli
# suggested arguments to the command SET
127.0.0.1:6379> set key value [EX seconds|PX milliseconds|EXAT timestamp|PXAT milliseconds-timestamp|KEEPTTL] [NX|XX] [GET]

127.0.0.1:6379> set X 1
OK

Let’s now explore the basics: SET, GET, TYPE, TTL, EXPIRE, PERSIST, conditional assignments (NX, XX), GETSET (deprecated in favor of SET with the GET option), DEL, EXISTS, MSET, and MGET.

127.0.0.1:6379> get X
"1"
127.0.0.1:6379> type X
string

# The key X was set with no TTL - the ttl command returns -1.
127.0.0.1:6379> ttl X
(integer) -1

# The key X will expire in 10 seconds
127.0.0.1:6379> set X 1 EX 10
OK

# 7 seconds remaining before the key X expires
127.0.0.1:6379> ttl X
(integer) 7

# The key still exists
127.0.0.1:6379> get X
"1"
# The key expired and was removed - the ttl command returns -2
127.0.0.1:6379> ttl X
(integer) -2
# Indeed the key was removed
127.0.0.1:6379> get X
(nil)

# It is also possible to set a key first and mark it to expire later
127.0.0.1:6379> set X 1
OK
127.0.0.1:6379> expire X 5
(integer) 1
127.0.0.1:6379> ttl X
(integer) 3

# Removing TTL for a key
127.0.0.1:6379> set X 5 EX 60
OK
127.0.0.1:6379> ttl X
(integer) 57
127.0.0.1:6379> PERSIST X
(integer) 1
# TTL removed
127.0.0.1:6379> ttl X
(integer) -1

127.0.0.1:6379> set X 1
OK
# Set X only if X is not set
127.0.0.1:6379> set X 2 NX
(nil)
# X already existed and was not overwritten by the last command
127.0.0.1:6379> get X
"1"
# Set X only if X is already set
127.0.0.1:6379> set X 2 XX
OK
# X existed, so it was overwritten
127.0.0.1:6379> get X
"2"
# Set Y only if Y is already set
127.0.0.1:6379> set Y 3 XX
(nil)
# Y was not set
127.0.0.1:6379> get Y
(nil)

# Returns the current value, then sets the key to a new value
127.0.0.1:6379> getset X 3
"2"
127.0.0.1:6379> get X
"3"

# GETSET is deprecated. The recommended command is:
127.0.0.1:6379> set X 3
OK
127.0.0.1:6379> SET X 4 GET
"3"
127.0.0.1:6379> GET X
"4"

# Returns 1 (exists) or 0 (doesn't exist)
127.0.0.1:6379> set A 1
OK
127.0.0.1:6379> exists A
(integer) 1
127.0.0.1:6379> del A
(integer) 1
127.0.0.1:6379> get A
(nil)
127.0.0.1:6379> exists A
(integer) 0

# Sets multiple keys at once
127.0.0.1:6379> mset X 1 Y 2 Z 3
OK
# Retrieves multiple keys at once
127.0.0.1:6379> mget X Y Z
1) "1"
2) "2"
3) "3"
127.0.0.1:6379> get X
"1"
127.0.0.1:6379> get Y
"2"
127.0.0.1:6379> get Z
"3"
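
The return values above (-1, -2, nil for a failed NX or XX) are easier to remember with a small model. The sketch below is a toy in-memory store in Python that mimics those semantics. It is an illustration only and says nothing about how Redis is implemented; the class name ToyStore and its injectable clock are assumptions made for this example, and unlike Redis it truncates the remaining time instead of rounding it.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class ToyStore:
    """In-memory model of SET/GET/TTL/PERSIST semantics (not real Redis)."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        # key -> (value, absolute expiry time, or None for "no TTL")
        self._data: Dict[str, Tuple[str, Optional[float]]] = {}

    def _purge(self, key: str) -> None:
        """Drop the key if its expiry time has passed."""
        entry = self._data.get(key)
        if entry is not None and entry[1] is not None and entry[1] <= self._clock():
            del self._data[key]

    def set(self, key: str, value: str, ex: Optional[float] = None,
            nx: bool = False, xx: bool = False) -> bool:
        self._purge(key)
        exists = key in self._data
        if (nx and exists) or (xx and not exists):
            return False  # redis-cli prints (nil) in this case
        expires_at = None if ex is None else self._clock() + ex
        self._data[key] = (value, expires_at)  # a plain SET clears any TTL
        return True

    def get(self, key: str) -> Optional[str]:
        self._purge(key)
        entry = self._data.get(key)
        return None if entry is None else entry[0]

    def ttl(self, key: str) -> int:
        self._purge(key)
        if key not in self._data:
            return -2  # key does not exist
        expires_at = self._data[key][1]
        if expires_at is None:
            return -1  # key exists but has no TTL
        return int(expires_at - self._clock())

    def persist(self, key: str) -> int:
        self._purge(key)
        entry = self._data.get(key)
        if entry is not None and entry[1] is not None:
            self._data[key] = (entry[0], None)
            return 1
        return 0
```

Passing a fake clock (for example a lambda that reads a variable) makes it possible to step through the expiry cases without waiting for real seconds to pass.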

3.2. Redis Hashes

In the last example above, the command mset sets multiple keys (X, Y, Z) to multiple values (1, 2, 3) in a single command. Working with hashes using hmset looks very similar: the first argument is the hash key, and each subsequent pair is a field/value combination that makes up the contents of the hash. (HMSET is deprecated since Redis 4.0.0; HSET accepts multiple field/value pairs and is the recommended replacement.)

# Set multiple fields into a hash
127.0.0.1:6379> hmset fabio first_name Fabio last_name Miranda
OK

# Set single field into a hash
127.0.0.1:6379> hset fabio age 40
(integer) 1

# Set single field into a hash only if it doesn't exist
127.0.0.1:6379> hsetnx fabio first_name Fabbbbbio
(integer) 0
127.0.0.1:6379> hget fabio first_name
"Fabio"

# Get single field in a hash
127.0.0.1:6379> hget fabio first_name
"Fabio"
127.0.0.1:6379> hget fabio last_name
"Miranda"
127.0.0.1:6379> hget fabio age
"40"

# Get nonexistent field
127.0.0.1:6379> hget fabio birthday
(nil)

# Check field existence
127.0.0.1:6379> hexists fabio birthday
(integer) 0
127.0.0.1:6379> hexists fabio first_name
(integer) 1

# Get multiple fields in a hash
127.0.0.1:6379> hmget fabio first_name last_name age
1) "Fabio"
2) "Miranda"
3) "40"

# Get all fields in a hash
127.0.0.1:6379> hgetall fabio
1) "first_name"
2) "Fabio"
3) "last_name"
4) "Miranda"
5) "age"
6) "40"

# Get all field names in a hash
127.0.0.1:6379> hkeys fabio
1) "first_name"
2) "last_name"
3) "age"

# Get all values in a hash
127.0.0.1:6379> hvals fabio
1) "Fabio"
2) "Miranda"
3) "40"

# Number of fields
127.0.0.1:6379> hlen fabio
(integer) 3

# Delete field - returns 1
127.0.0.1:6379> hdel fabio age
(integer) 1

# Deleting non-existing field - returns 0
127.0.0.1:6379> hdel fabio age
(integer) 0

# Check deleted field
127.0.0.1:6379> hexists fabio age
(integer) 0

# Retrieve a deleted field
127.0.0.1:6379> hget fabio age
(nil)

# Deleting the hash
127.0.0.1:6379> del fabio
(integer) 1
127.0.0.1:6379> hgetall fabio
(empty array)

# Automatic deletion after all fields are removed
127.0.0.1:6379> hset fabio name "Fabio Miranda"
(integer) 1
127.0.0.1:6379> hget fabio name
"Fabio Miranda"
127.0.0.1:6379> hdel fabio name
(integer) 1
127.0.0.1:6379> hgetall fabio
(empty array)
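
The hgetall reply shown earlier in this section is a flat list that alternates field names and values. An application usually wants a mapping instead. The sketch below shows one way to pair them up in Python; the function name pairs_to_dict is made up for this example, and client libraries typically perform this conversion themselves.

```python
from typing import Dict, List


def pairs_to_dict(flat: List[str]) -> Dict[str, str]:
    """Turn [field1, value1, field2, value2, ...] into a dict."""
    if len(flat) % 2 != 0:
        raise ValueError("expected an even number of items (field/value pairs)")
    # Even positions hold field names, odd positions hold their values.
    return dict(zip(flat[0::2], flat[1::2]))


reply = ["first_name", "Fabio", "last_name", "Miranda", "age", "40"]
print(pairs_to_dict(reply))
# {'first_name': 'Fabio', 'last_name': 'Miranda', 'age': '40'}
```

Note that every value comes back as a string, including the age, so numeric fields need an explicit conversion on the application side.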

3.3. FLUSHALL/KEYS/SCAN/MATCH operations

SCAN is a cursor-based iterator over the keys of the Redis database. While it’s possible to list the keys with the KEYS command, the latter may block the server for a long time on a large database, whereas SCAN returns a small batch per call and doesn’t have this downside.

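Before testing it, it’s a good idea to run flushall to clear all keys. Note that flushall removes every key from every database on the server, so only run it against a throwaway instance.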

127.0.0.1:6379> flushall
OK

The SCAN command notation is:

127.0.0.1:6379> scan cursor [MATCH pattern] [COUNT count] [TYPE type]

A scan starts by passing cursor 0, and it ends when the server returns cursor 0.

127.0.0.1:6379> scan 0
1) "0"
2) (empty array)

The first element of the reply is the cursor to pass to the next call. Since there are no keys, the scan completes on the first call: the cursor comes back as 0 and the key list is an empty array.

Let’s set each letter of the alphabet to its position and use scan to iterate:

127.0.0.1:6379> mset A 1 B 2 C 3 D 4 E 5 F 6 G 7 H 8 I 9 J 10 K 11 L 12 M 13 N 14 O 15 P 16 Q 17 R 18 S 19 T 20 U 21 V 22 W 23 X 24 Y 25 Z 26
OK

# List all the keys
127.0.0.1:6379> KEYS *
 1) "R"
 2) "C"
 3) "D"
 4) "O"
 5) "X"
 6) "J"
 7) "W"
 8) "N"
 9) "B"
10) "S"
11) "V"
12) "E"
13) "K"
14) "U"
15) "Y"
16) "A"
17) "F"
18) "M"
19) "Q"
20) "G"
21) "P"
22) "L"
23) "T"
24) "I"
25) "H"
26) "Z"

Now let’s iterate using the SCAN command. Each call passes the cursor returned by the previous one, until the cursor returns to 0. Across the three iterations, all the letters are returned.

127.0.0.1:6379> scan 0
1) "22"
2)  1) "U"
    2) "M"
    3) "Q"
    4) "X"
    5) "J"
    6) "F"
    7) "E"
    8) "T"
    9) "P"
   10) "B"

127.0.0.1:6379> scan 22
1) "21"
2)  1) "I"
    2) "H"
    3) "R"
    4) "C"
    5) "D"
    6) "O"
    7) "Y"
    8) "G"
    9) "W"
   10) "N"

127.0.0.1:6379> scan 21
1) "0"
2) 1) "A"
   2) "V"
   3) "L"
   4) "S"
   5) "K"
   6) "Z"
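
The cursor loop above can be sketched in Python as a small generator. It assumes a callable that takes a cursor and returns a (next_cursor, keys) pair; here a dictionary that replays the three replies above stands in for the server. The names scan_all and PAGES are made up for this illustration.

```python
from typing import Callable, Iterator, List, Tuple

# Any callable that takes a cursor and returns (next_cursor, keys).
ScanFn = Callable[[int], Tuple[int, List[str]]]


def scan_all(scan: ScanFn) -> Iterator[str]:
    """Yield every key by following cursors until the server returns 0."""
    cursor = 0
    while True:
        cursor, keys = scan(cursor)
        yield from keys
        if cursor == 0:
            break


# Stand-in for a server: replays the three replies shown above.
PAGES = {
    0: (22, list("UMQXJFETPB")),
    22: (21, list("IHRCDOYGWN")),
    21: (0, list("AVLSKZ")),
}

keys = list(scan_all(lambda cursor: PAGES[cursor]))
print(len(keys))  # 26
```

With a real client, the lambda would wrap the library's SCAN call instead of the dictionary. SCAN may return the same key more than once during a full iteration, so callers that need uniqueness should collect the keys into a set.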

Finally, we can use the command SCAN with a MATCH argument to obtain an array of keys based on pattern-matching criteria:

127.0.0.1:6379> scan 0 MATCH A
1) "22"
2) (empty array)
127.0.0.1:6379> scan 22 MATCH A
1) "11"
2) 1) "A"
127.0.0.1:6379> scan 11 MATCH A
1) "0"
2) (empty array)

Note that with MATCH, many iterations may return no elements, because the filter is applied after a batch of keys has been retrieved. COUNT can be used to ask Redis to examine more keys per call:

# COUNT hints at how many keys to examine in each call
127.0.0.1:6379> scan 0 MATCH A COUNT 50
1) "0"
2) 1) "A"
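
Why MATCH can return empty pages is easier to see in a toy model. The Python sketch below takes a batch of keys first and applies the pattern afterwards. It is only an illustration: real Redis cursors are not list offsets, fnmatchcase only approximates Redis glob patterns, and the function name toy_scan is made up for this example.

```python
from fnmatch import fnmatchcase
from typing import List, Tuple


def toy_scan(keys: List[str], cursor: int, match: str = "*",
             count: int = 10) -> Tuple[int, List[str]]:
    """Toy SCAN: take a batch of `count` keys first, then apply the pattern."""
    batch = keys[cursor:cursor + count]
    next_cursor = cursor + count
    if next_cursor >= len(keys):
        next_cursor = 0  # iteration complete
    # The filter runs on the batch, so a call can return nothing at all.
    return next_cursor, [k for k in batch if fnmatchcase(k, match)]


letters = [chr(c) for c in range(ord("A"), ord("Z") + 1)]
print(toy_scan(letters, 0, match="Z"))            # (10, [])
print(toy_scan(letters, 0, match="Z", count=50))  # (0, ['Z'])
```

In this model the first call looks at A through J only, finds no match, and still returns a non-zero cursor. A larger count covers the whole key list in one call.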

4. A simple Redis database using HSCAN/MATCH

The examples so far demonstrated how to perform CRUD (Create/Read/Update/Delete) operations on scalar values and hashes, as well as how to use SCAN to iterate over the data. If the keys follow a predefined schema, for instance user:ID, it’s possible to build a flexible database in which each hash represents a data object.
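
A key schema like user:ID is just a naming convention, so it helps to keep it in one place. The Python sketch below shows a pair of helpers for building and parsing such keys; the function names user_key and parse_user_key are hypothetical and exist only for this illustration.

```python
def user_key(user_id: int) -> str:
    """Build the Redis key for a user hash, e.g. user:95."""
    return f"user:{user_id}"


def parse_user_key(key: str) -> int:
    """Extract the numeric ID from a user:ID key, or raise ValueError."""
    prefix, _, raw_id = key.partition(":")
    if prefix != "user" or not raw_id.isdecimal():
        raise ValueError(f"not a user key: {key!r}")
    return int(raw_id)


print(user_key(95))               # user:95
print(parse_user_key("user:95"))  # 95
```

With keys named this way, a pattern such as user:* can be passed to the MATCH option of SCAN to iterate over user hashes only.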

4.1. Fetching API and storing some data into Redis

The bash command below fetches data from a fake person data generator API, extracts the name and birth_data attributes from the response using the jq JSON processor, and interpolates them into a string of field/value arguments that xargs appends to an HMSET command.

redis-cli flushall

for i in {1..100} ; do \
  curl -s https://api.namefake.com/english-united-states | \
  jq -r '"name \"\(.name)\" birthday \(.birth_data)"' | \
  xargs redis-cli hmset "user:${i}" ; \
done
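
The same extraction step can be sketched in Python. Building the command as a list of arguments sidesteps shell quoting, which matters because a name containing an apostrophe would likely trip up the xargs step. The function name user_hash_command is made up for this example, and the sample payload is a hand-written stand-in for an API response, not real output.

```python
import json
from typing import List


def user_hash_command(user_id: int, payload: str) -> List[str]:
    """Build the argument list for an HMSET on the key user:<id>."""
    record = json.loads(payload)
    return [
        "HMSET", f"user:{user_id}",
        "name", str(record["name"]),
        # The API calls the field birth_data; it is stored as birthday.
        "birthday", str(record["birth_data"]),
    ]


sample = '{"name": "Margarita Swift", "birth_data": "1973-09-02"}'
print(user_hash_command(95, sample))
# ['HMSET', 'user:95', 'name', 'Margarita Swift', 'birthday', '1973-09-02']
```

The resulting list can then be handed to whichever Redis client the application uses, one argument per element.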

Let’s now play with our small users database:

127.0.0.1:6379> SCAN 0 COUNT 5
1) "8"
2) 1) "user:78"
   2) "user:31"
   3) "user:50"
   4) "user:65"
   5) "user:10"

127.0.0.1:6379> hgetall user:95
1) "name"
2) "Margarita Swift"
3) "birthday"
4) "1973-09-02"

4.2. Scanning large hashes (HSCAN)

HSCAN works similarly to SCAN: it provides a cursor to incrementally iterate over a large hash. The command below fetches the data of a single fake person from the API with all the attributes (address, phones, email, etc.) and replaces the characters {, }, :, and , in the JSON response with spaces, transforming the JSON data into a list of fields that can be stored in Redis. This quick substitution also rewrites colons and commas inside values, as the url and hair fields in the output below show.

% curl -s https://api.namefake.com/english-united-states | \
          sed -e 's/[{:,}]/ /g' | \
          xargs redis-cli hmset large_object
OK
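
A JSON parser avoids touching punctuation inside values. The Python sketch below flattens a one-level JSON object into the field/value list that a hash command expects. The function name json_to_hash_fields and the sample payload are assumptions made for this illustration; the payload is hand-written, not an actual API response.

```python
import json
from typing import List


def json_to_hash_fields(payload: str) -> List[str]:
    """Flatten a one-level JSON object into [field, value, field, value, ...]."""
    record = json.loads(payload)
    if not isinstance(record, dict):
        raise ValueError("expected a JSON object at the top level")
    fields: List[str] = []
    for name, value in record.items():
        # Redis hash values are strings; non-string values are kept as JSON text.
        text = value if isinstance(value, str) else json.dumps(value)
        fields.extend([name, text])
    return fields


sample = '{"url": "https://example.test/a", "hair": "Wavy, Black", "height": 203}'
print(json_to_hash_fields(sample))
# ['url', 'https://example.test/a', 'hair', 'Wavy, Black', 'height', '203']
```

Here the colon in the URL and the comma in the hair value survive unchanged, which the character substitution approach cannot guarantee.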

Let’s iterate the stored hash using HSCAN.

% redis-cli
127.0.0.1:6379> HSCAN large_object 0
1) "38"
2)  1) "bonus"
    2) "45"
    3) "url"
    4) "https \\/\\/api.namefake.com\\/english-united-states\\/male\\/84146c4523e4671cd156f510f70890c7"
    5) "ipv4_url"
    6) "\\/\\/myip-address.com\\/ip-lookup\\/167.227.68.13"
    7) "company"
    8) "Mills Ltd"
    9) "email_url"
   10) "\\/\\/emailfake.com\\/isluntvia.com\\/asawayn"
   11) "name"
   12) "Richmond Jast II"
   13) "birth_data"
   14) "1994-05-02"
   15) "ipv4"
   16) "167.227.68.13"
   17) "macaddress"
   18) "2F 54 55 7A 8B 5F"
   19) "longitude"
   20) "-84.963336"
127.0.0.1:6379> HSCAN large_object 38
1) "17"
2)  1) "cardexpir"
    2) "03\\/24"
    3) "latitude"
    4) "36.808752"
    5) "sport"
    6) "Beach Volleyball"
    7) "domain"
    8) "lowe.com"
    9) "phone_h"
   10) "119.458.9198x08280"
   11) "height"
   12) "203"
   13) "eye"
   14) "Blue"
   15) "phone_w"
   16) "(825)100-6243x7548"
   17) "password"
   18) "d#3vfV(_]Hm]|D8"
   19) "email_d"
   20) "isluntvia.com"
   21) "maiden_name"
   22) "Haley"
127.0.0.1:6379> HSCAN large_object 17
1) "39"
2)  1) "username"
    2) "hklein"
    3) "weight"
    4) "55.7"
    5) "domain_url"
    6) "\\/\\/myip-address.com\\/ip-lookup\\/lowe.com"
    7) "address"
    8) "9129 Sallie Island\\nNew Hyman  AZ 45244"
    9) "uuid"
   10) "181f9097-8989-3e84-b888-5ff3dd71c782"
   11) "blood"
   12) "A\\u2212"
   13) "pict"
   14) "10male"
   15) "color"
   16) "teal"
   17) "email_u"
   18) "asawayn"
   19) "useragent"
   20) "Opera\\/8.30 (Windows NT 5.2; sl-SI) Presto\\/2.9.279 Version\\/11.00"
127.0.0.1:6379> HSCAN large_object 39
1) "0"
2) 1) "plasticcard"
   2) "5107318503996156"
   3) "hair"
   4) "Wavy  Black"

5. Final Considerations

By the end of this walkthrough, most of the essentials of storing, retrieving, updating, deleting, and scanning (iterating) data with Redis have been covered. The next posts will cover more topics such as RedisJSON and RediSearch, Redis Pub/Sub capabilities, tools like node-redis, and problem-solving topics such as syncing external API data sources using Redis.

About Fábio Miranda

I joined Commit in Oct/2021 as Engineering Partner and it's been a wonderful journey of self-knowledge and professional experience. I'm a graduate of ITA (Instituto Tecnológico de Aeronáutica, the Aeronautical Technological Institute), where I majored in Computer Engineering. I currently live in Belém do Pará, Brazil. When I'm not hard at work, you can find me playing Nintendo games, watching Formula 1 races cheering for Lewis, or watching Marvel movies and series.