1 基础

Redis 是存储结构为 key-value 的存储系统,其中 value 支持 String(字符串)、List(列表)、Set(集合)、Hash(Map)、SortSet(有序集合) 5 种类型。Reids 对这5中类型都提供了丰富的操作,而且这些操作都是原子性的。

1.1 key

二进制安全,意味着可以用任何二进制类型当做 key 值。比如 简单的字符串 “foo” 或 图片文件的内容都可以,空字符串也是有效的 key 值。

1.2 五种类型

  • String

最常用类型,可以是字符串、整数或者浮点数。可以当作字符串做拼接、取子串等;如果是数字类型,可以做自增或自减操作。 PS:所有测试代码均使用 Jedis 写,感觉比命令更容易理解。

// 字符串
jedis.set("str_key", "value");
System.out.println(jedis.get("str_key"));  // value

jedis.append("str_key", "123456");
System.out.println(jedis.substr("str_key", 1, 16));  // alue123456

// 数字
jedis.set("int_key", "123");
System.out.println(jedis.incr("int_key"));   // 124
System.out.println(jedis.decrBy("int_key", 4));  // 120

jedis.set("float_key", "123.23");
System.out.println(jedis.incrByFloat("float_key", 2)); // 125.23
  • List

链表结构,每一个结点都是一个字符串。可以从链表的两端推入(LPUSH、RPUSH)或者弹出元素(LPOP、RPOP)、根据偏移量对链表进行修剪(LTRIM)、读取单个或者多个元素(LRANGE)等等。

jedis.lpush("list", "a", "b");  // 头头部插 [b, a]
jedis.rpush("list", "y", "z");  // 从尾部插 [b, a, y, z]

System.out.println(jedis.lpop("list"));  // 输出: b
System.out.println(jedis.rpop("list"));  // 输出: z

jedis.lset("list", 0, "123");  // [123, y]

System.out.println(jedis.llen("list")); // 输出: 2

jedis.ltrim("list", 1, 1);
System.out.println(jedis.lpop("list"));  // 输出: y
  • Set

无序集合,元素类型是字符串,不可重复。可以操作集合元素、检查一个元素是否存在于集合中(ISMEMBER)、取交集(SINTER)、并集(SUNION)、差集(SDIFF)等。

jedis.sadd("set1", "a", "b", "a", "f");
jedis.sadd("set2", "a", "g");
jedis.sadd("set3", "b");
jedis.sadd("set4", "a");

// 所有元素
System.out.println(jedis.smembers("set1"));  // 输出: [a, b, f]
// 长度
System.out.println(jedis.scard("set1"));  // 输出: 3
// 是否包含
System.out.println(jedis.sismember("set1", "a"));  // 输出: true
System.out.println(jedis.sismember("set1", "z"));  // 输出: false

// sdiff 从第一个 set 中删除其它 set 中包含的元素
System.out.println(jedis.sdiff("set1", "set2", "set3"));  // 输出: [f]

// sdiffstore 和sdiff 一样,不同的是结果存到了一个新的 set
jedis.sdiffstore("set", "set1", "set2", "set3");
System.out.println(jedis.smembers("set"));   // 输出: [f]

// sunion 取并集。sunionstore 同sdiffstore
System.out.println(jedis.sunion("set1", "set2", "set3"));  // 输出:  [a, b, f, g]

// sinter 取交集。sinterstore 同上面两个
System.out.println(jedis.sinter("set1", "set2", "set4"));  // 输出:  [a]
  • Hash

包含键值对的无序散列表,当作是 Java.util.HashMap 用就可以

jedis.hset("hash", "key1", "123");

Map<String, String> data = new HashMap<>();
data.put("key2", "456");
data.put("key3", "789");
jedis.hmset("hash", data);

System.out.println(jedis.hget("hash", "key1"));  // 123

Map<String, String> map = jedis.hgetAll("hash");
System.out.println(map);  // {key1=123, key2=456, key3=789}
  • Sort Set

这应该很多人以往没有接触过的类型,存储的是字符串成员(member)与浮点数分值(score)之间的有序映射,元素的排列顺序由分值的大小决定。同 Set 类似,可以操作元素,取并集、交集,但是不能取差集;根据分值范围(range)或者成员来获取元素等。

jedis.zadd("z-set", 2.1, "21");
jedis.zadd("z-set", 2.6, "26");
jedis.zadd("z-set", 1.5, "15");
jedis.zadd("z-set", 1.8, "18");
jedis.zadd("z-set", 1.1, "11");

// 元素数
System.out.println(jedis.zcard("z-set"));  // 5
System.out.println(jedis.zcount("z-set", 1.5, 2.5));  // 3

// 按索引 、按score、按 member 取结果集
System.out.println(jedis.zrange("z-set", 0, 10));  // [11, 15, 18, 21, 26]
System.out.println(jedis.zrangeByScore("z-set", 1.5, 2.5));  // [15, 18, 21]
System.out.println(jedis.zrangeByLex("z-set", "[15", "(21" ));  // [15, 18]

// zrevrange 递减排列,zrevrangeByScore、zrevrangeByLex 一个意思
System.out.println(jedis.zrevrange("z-set", 0, 10)); // [26, 21, 18, 15, 11]

1.3 过期时间

Redis 提供了 EXPIRE 命令,指定元素存活时间。需要注意的是,只能对 key 设置过期时间,不能对 List、Set、Hash 中的每一项单独设置。

完整的命令列表 http://redis.io/commands#

2 事务

Redis 的事务和 Mysql 通过加锁、事务ID、 回滚指针、UNDO LOG 等机制实现的事务不一样。在写数据的时候没有实现悲观锁,而是乐观锁机制;没有 Rollback 功能,但能取消;如果事务中某命令失败,已经成功执行的命令不会回滚。

Redis 事务更应该理解为 “保证一批命令连续执行的机制”’, 使用5个命令实现这种事务的功能,分别是 WATCH、MULTI、EXEC、UNWATCH、DISCARD。

2.1 基本事务

  • MULTI 开始事务
  • EXEC 提交事务

Redis 服务端会在接收到 MULTI 命令后将其后续命令存入一个队列,接收到 EXEC 后一次性执行 MULTI 和 EXEC 之间的所有命令。在执行 EXEC 之前,不执行其它客户端发来的请求。 因为 Redis 是单线程的,所以这种机制很容易实现。使用方式:

Transaction tx = jedis.multi();
tx.set("test_aaa", "20");
tx.incr("test_aaa");

List<Object> list = tx.exec();
System.out.println(list);   // 输出结果: [OK, 21]

注意的是,因为批量执行的机制,所以我们不能实现在一个事务先查值,再根据值的内容进行操作的功能。

2.2 完整事务

如果某一客户端要对一个 key 进行事务操作,恰好在发送 MULTI 前,这个 key 被其他客户端请求修改了,那么这一次的事务操作就会产生错误的结果。为解决这个问题,需要使用另外3个命令

  • WATCH 监视一个(或多个) key ,如果在事务执行之前这个(或这些) key 被其他命令所改动,那么事务将被打断。
  • UNWATCH 取消 WATCH 命令对所有 key 的监视
  • DISCARD 取消一个事务。很简单,就是清空任务队列并退出任务上下文

如果 WATCH 的 key 被其它客户端请求改变,执行结果为null

jedis.set("test_tx", "1");
jedis.watch("test_tx");

otherClient.set("test_tx", "2");  // key=test_tx 被改变

Transaction tx = jedis.multi();
tx.incr("test_tx");

List<Object> list = tx.exec();
System.out.println(list);  // 结果: null
System.out.println(jedis.get("test_tx"));  // 结果: 2

如果 执行了 UNWATCH 取消监视,跟不加监视一样。

jedis.set("test_tx", "1");
jedis.watch("test_tx");

otherClient.set("test_tx", "2");  // key=test_tx 被改变

jedis.unwatch();  // 取消监视

Transaction tx = jedis.multi();
tx.incr("test_tx");
List<Object> list = tx.exec();

System.out.println(list);  // 结果: [3]

System.out.println(jedis.get("test_tx"));  // 结果: 3

取消事务

Transaction tx = jedis.multi();
tx.set("test_tx", "1");
tx.discard();
tx.exec();  // 结果: JedisDataException: ERR EXEC without MULTI

从异常信息可以看出,执行 EXEC 之前并没有执行 MULTI,说明任务队列已经被清空。

2.3 其它问题

Redis 只能保证事务的每个命令连续执行,但是如果事务中的一个命令失败了,并不回滚其他命令,比如:

Transaction tx = jedis.multi();

tx.set("test_aaa", "20");
tx.incr("test_aaa");

tx.set("test_bbb", "bbb");
tx.incr("test_bbb");

tx.set("test_ccc", "10");
tx.incr("test_ccc");

List<Object> list = tx.exec();
System.out.println(list);   // 输出结果:[OK, 21, OK, redis.clients.jedis.exceptions.JedisDataException: ERR value is not an integer or out of range, OK, 11]

3 持久化

持久化是将易丢失的瞬时数据转为持久数据的一种机制,对于 Redis 来说,就是将内存中的数据保存到磁盘上。Redis 有两种持久化方式,一种是 快照(Snapshotting),另一种叫 只追加文件(append-only file,AOF) ,这两种方式都可以将内存中的数据库状态保存到磁盘上,但是原理非常不同。

3.1 Snapshotting

将某一时刻的所有数据都写入硬盘,生成一个当前时间点的数据副本。这个副本可以用作备份,也可以复制到其它 Redis 服务器上创建出一个具有相同数据的服务器(eg:增加从库就这么干)。

通过配置 redis.conf 的一些选项可以控制创建快照的一些行为,如下是快照相关的默认配置(截取部分配置文件内容,方便查看原版注释。下同)。包含:

  • 什么时候创建快照
  • 创建失败后是否仍然就绪执行写命令
  • 是否启用压缩
  • 是否启用验证
  • 快照文件名
  • 快照文件的目录
################################ SNAPSHOTTING  ################################
#
# Save the DB on disk:
#
#   save <seconds> <changes>
#
#   Will save the DB if both the given number of seconds and the given
#   number of write operations against the DB occurred.
#
#   In the example below the behaviour will be to save:
#   after 900 sec (15 min) if at least 1 key changed
#   after 300 sec (5 min) if at least 10 keys changed
#   after 60 sec if at least 10000 keys changed
#
#   Note: you can disable saving completely by commenting out all "save" lines.
#
#   It is also possible to remove all the previously configured save
#   points by adding a save directive with a single empty string argument
#   like in the following example:
#
#   save ""

save 900 1
save 300 10
save 60 10000

# By default Redis will stop accepting writes if RDB snapshots are enabled
# (at least one save point) and the latest background save failed.
# This will make the user aware (in a hard way) that data is not persisting
# on disk properly, otherwise chances are that no one will notice and some
# disaster will happen.
#
# If the background saving process will start working again Redis will
# automatically allow writes again.
#
# However if you have setup your proper monitoring of the Redis server
# and persistence, you may want to disable this feature so that Redis will
# continue to work as usual even if there are problems with disk,
# permissions, and so forth.
stop-writes-on-bgsave-error yes

# Compress string objects using LZF when dump .rdb databases?
# For default that's set to 'yes' as it's almost always a win.
# If you want to save some CPU in the saving child set it to 'no' but
# the dataset will likely be bigger if you have compressible values or keys.
rdbcompression yes

# Since version 5 of RDB a CRC64 checksum is placed at the end of the file.
# This makes the format more resistant to corruption but there is a performance
# hit to pay (around 10%) when saving and loading RDB files, so you can disable it
# for maximum performances.
#
# RDB files created with checksum disabled have a checksum of zero that will
# tell the loading code to skip the check.
rdbchecksum yes

# The filename where to dump the DB
dbfilename dump.rdb

# The working directory.
#
# The DB will be written inside this directory, with the filename specified
# above using the 'dbfilename' configuration directive.
#
# The Append Only File will also be created inside this directory.
#
# Note that you must specify a directory here, not a file name.
dir ./

3.2 AOF

在执行写命令时,将被执行的命令写到 AOF 文件的末尾,以此来记录数据发生的变化。可配置的选项主要有:

  • 是否执行持久化
  • 多长时间同步写命令到硬盘
  • 压缩时是否能执行同步操作
  • 指定Redis重写aof文件的条件
############################## APPEND ONLY MODE ###############################

# By default Redis asynchronously dumps the dataset on disk. This mode is
# good enough in many applications, but an issue with the Redis process or
# a power outage may result into a few minutes of writes lost (depending on
# the configured save points).
#
# The Append Only File is an alternative persistence mode that provides
# much better durability. For instance using the default data fsync policy
# (see later in the config file) Redis can lose just one second of writes in a
# dramatic event like a server power outage, or a single write if something
# wrong with the Redis process itself happens, but the operating system is
# still running correctly.
#
# AOF and RDB persistence can be enabled at the same time without problems.
# If the AOF is enabled on startup Redis will load the AOF, that is the file
# with the better durability guarantees.
#
# Please check http://redis.io/topics/persistence for more information.

appendonly no   # 开启/关闭 AOF 持久化方式

# The name of the append only file (default: "appendonly.aof")

appendfilename "appendonly.aof"

# The fsync() call tells the Operating System to actually write data on disk
# instead of waiting for more data in the output buffer. Some OS will really flush
# data on disk, some other OS will just try to do it ASAP.
#
# Redis supports three different modes:
#
# no: don't fsync, just let the OS flush the data when it wants. Faster.
# always: fsync after every write to the append only log. Slow, Safest.
# everysec: fsync only one time every second. Compromise.
#
# The default is "everysec", as that's usually the right compromise between
# speed and data safety. It's up to you to understand if you can relax this to
# "no" that will let the operating system flush the output buffer when
# it wants, for better performances (but if you can live with the idea of
# some data loss consider the default persistence mode that's snapshotting),
# or on the contrary, use "always" that's very slow but a bit safer than
# everysec.
#
# More details please check the following article:
# http://antirez.com/post/redis-persistence-demystified.html
#
# If unsure, use "everysec".

# appendfsync always
appendfsync everysec
# appendfsync no

# When the AOF fsync policy is set to always or everysec, and a background
# saving process (a background save or AOF log background rewriting) is
# performing a lot of I/O against the disk, in some Linux configurations
# Redis may block too long on the fsync() call. Note that there is no fix for
# this currently, as even performing fsync in a different thread will block
# our synchronous write(2) call.
#
# In order to mitigate this problem it's possible to use the following option
# that will prevent fsync() from being called in the main process while a
# BGSAVE or BGREWRITEAOF is in progress.
#
# This means that while another child is saving, the durability of Redis is
# the same as "appendfsync none". In practical terms, this means that it is
# possible to lose up to 30 seconds of log in the worst scenario (with the
# default Linux settings).
#
# If you have latency problems turn this to "yes". Otherwise leave it as
# "no" that is the safest pick from the point of view of durability.

no-appendfsync-on-rewrite no

# Automatic rewrite of the append only file.
# Redis is able to automatically rewrite the log file implicitly calling
# BGREWRITEAOF when the AOF log size grows by the specified percentage.
#
# This is how it works: Redis remembers the size of the AOF file after the
# latest rewrite (if no rewrite has happened since the restart, the size of
# the AOF at startup is used).
#
# This base size is compared to the current size. If the current size is
# bigger than the specified percentage, the rewrite is triggered. Also
# you need to specify a minimal size for the AOF file to be rewritten, this
# is useful to avoid rewriting the AOF file even if the percentage increase
# is reached but it is still pretty small.
#
# Specify a percentage of zero in order to disable the automatic AOF
# rewrite feature.

auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb

# An AOF file may be found to be truncated at the end during the 
# startup process, when the AOF data gets loaded back into memory.
# This may happen when the system where Redis is running
# crashes, especially when an ext4 filesystem is mounted without the
# data=ordered option (however this can't happen when Redis itself
# crashes or aborts but the operating system still works correctly).
#
# Redis can either exit with an error when this happens, or load as much
# data as possible (the default now) and start if the AOF file is found
# to be truncated at the end. The following option controls this behavior.
#
# If aof-load-truncated is set to yes, a truncated AOF file is loaded and
# the Redis server starts emitting a log to inform the user of the event.
# Otherwise if the option is set to no, the server aborts with an error
# and refuses to start. When the option is set to no, the user requires
# to fix the AOF file using the "redis-check-aof" utility before to restart
# the server.
#
# Note that if the AOF file will be found to be corrupted in the middle
# the server will still exit with an error. This option only applies when
# Redis will try to read more data from the AOF file but not enough bytes
# will be found.
aof-load-truncated yes

3.2.1 同步机制

多长时间同步一次命令到硬盘文件。有3个选择,作用非别是:

  • appendfsync always : 每次收到写命令就立即强制写入磁盘,最慢的,但是保证完全的持久化,不推荐使用
  • appendfsync everysec : 每秒钟强制写入磁盘一次,在性能和持久化方面做了很好的折中,推荐
  • appendfsync no : 完全依赖os,性能最好,持久化没保证

3.2.2 重写机制

当 Redis 持续运行后,aof 文件的体积会逐渐增大,当删除一个足够大的文件时,可能会导致操作系统挂起数秒,并且文件中很多命令是多余的,一直维护这个大文件意义不大,因此 Redis 会通过删除冗余命令的方式对 aof 文件进行压缩/重写(rewrite)。

rewrite 默认触发条件为增长比例 100,大小为 64M。表示与上次 rewrite 的 aof 文件大小相比,当前 aof 文件增长量超过上次 aof 文件大小的100% 并且文件大于 64M 时,就会触发background rewrite。若配置为0,则会禁用自动 rewrite。文件最小大小的目的是用于阻止即使文件很小但是增长幅度很大也去重写 aof 文件。

需要注意到是重写 aof 文件的操作,并没有读取旧的 aof 文件。而是新开子进程将整个内存中的数据库内容用命令的方式重写了一个新的 aof 文件,之后的写命令继续写入新的 aof 文件。

比如,存在一个[num=100] 的数据,原 aof 文件中有 set(“num”, “1”) 和 99 个 incr 命令。重写后新的 aof 文件只有一个 set(“num”, “100”) 命令。

4 主从复制

两种方式开启复制功能,配置文件中设置 slaveof 或 在从机上执行 SLAVEOF 命令

4.1 复制过程

  1. 从机启动或执行SLAVEOF命令时清空本地数据,向主机发送同步请求
  2. 主机创建当前数据快照并缓存新收到的写命令
  3. 主机向从机发送创建好的快照文件和缓存的写命令
  4. 主机向从机同步发送写命令

另:

  • 如果主机创建快照过程中收到其它从机的请求,不会新创建快照文件,而是把这一份文件发送给两台从机
  • 从机也可以作为master,即复制的整体结构是一个树形的主从链