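Introduction
Redis is a powerful NoSQL data-structure store, which makes it easy to optimize performance around a particular business model. My recent work has been reorganizing our existing Java server framework and decoupling the network layer from the logic layer so that each can scale horizontally. Although I use Akka as the core framework in the logic layer and keep it as lock-free as possible, locks that span JVMs are still unavoidable. So I need to implement a distributed lock.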
The official implementation
The official documentation gives an implementation on the SETNX page.
- C4 sends SETNX lock.foo in order to acquire the lock
- The crashed client C3 still holds it, so Redis will reply with 0 to C4.
- C4 sends GET lock.foo to check if the lock expired. If it is not, it will sleep for some time and retry from the start.
- Instead, if the lock is expired because the Unix time at lock.foo is older than the current Unix time, C4 tries to perform: GETSET lock.foo (current Unix timestamp + lock timeout + 1)
- Because of the GETSET semantic, C4 can check if the old value stored at key is still an expired timestamp. If it is, the lock was acquired.
- If another client, for instance C5, was faster than C4 and acquired the lock with the GETSET operation, the C4 GETSET operation will return a non expired timestamp. C4 will simply restart from the first step. Note that even if C4 set the key a bit a few seconds in the future this is not a problem.
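However, with the officially recommended GETSET approach, the client that loses the race can correctly tell that it did not get the lock, but its GETSET has already overwritten the timestamp stored by the client that holds the lock. The direct consequence is that a holder which checks the stored value against its own timestamp before deleting can no longer release the lock.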
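The race described above can be reproduced with a small in-memory model. This is only a sketch: the `store` dict and the `setnx`, `get` and `getset` helpers are hypothetical stand-ins for the Redis commands, and the timestamps and the 10-second timeout are made-up values chosen for illustration.

```python
# In-memory stand-ins for the Redis commands (hypothetical helpers, not a client).
store = {}

def setnx(key, value):
    """Set key only if it does not exist; return 1 on success, 0 otherwise."""
    if key in store:
        return 0
    store[key] = value
    return 1

def get(key):
    return store.get(key)

def getset(key, value):
    """Set key to value and return the previous value."""
    old = store.get(key)
    store[key] = value
    return old

LOCK = "lock.foo"
TIMEOUT = 10  # assumed lock timeout in seconds

# C3 crashed while holding the lock; its stored timestamp is long expired.
store[LOCK] = 100

# At t=200 both C4 and C5 fail SETNX and see an expired timestamp via GET.
assert setnx(LOCK, 200 + TIMEOUT + 1) == 0
assert get(LOCK) < 200

# C5's GETSET arrives first: the old value is expired, so C5 owns the lock.
c5_value = 200 + TIMEOUT + 1
c5_won = getset(LOCK, c5_value) < 200

# C4's GETSET arrives one second later: the old value is C5's live
# timestamp, so C4 knows it lost, but it has already overwritten the key.
c4_value = 201 + TIMEOUT + 1
c4_won = getset(LOCK, c4_value) < 201

# The key now holds C4's timestamp, so C5 no longer recognizes its own lock.
holder_can_match = get(LOCK) == c5_value
```

After the run, `c5_won` is true and `c4_won` is false, yet the key holds C4's value, so a value-checked release by C5 fails.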
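A Lua-based implementation
The problem with the official implementation arises because separate Redis commands cannot make the get-check-set sequence atomic. So I decided to use Redis Lua scripting and implement get-check-set as a Lua script.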
Locking:
- script params: lock_key, current_timestamp, lock_timeout
- SETNX lock_key (current_timestamp + lock_timeout). If that fails, SET lock_key (current_timestamp + lock_timeout) only if current_timestamp > the stored value
- the client saves current_timestamp as lock_create_timestamp
Unlocking:
- script params: lock_key, lock_create_timestamp, lock_timeout
- DEL lock_key if lock_create_timestamp + lock_timeout == the stored value
The concrete implementation:
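--- lock
-- KEYS[1]: lock key; ARGV[1]: current timestamp; ARGV[2]: lock timeout
-- Returns 0 if acquired, 1 if held by another client, 2 if the key is not a string.
local now = tonumber(ARGV[1])
local timeout = tonumber(ARGV[2])
local to = now + timeout
-- Fast path: no lock exists, so store the expiry time.
local locked = redis.call('SETNX', KEYS[1], to)
if (locked == 1) then
    return 0
end
-- The key exists; refuse to touch it unless it is a string.
local kt = redis.call('type', KEYS[1])
if (kt['ok'] ~= 'string') then
    return 2
end
-- Take over the lock only if the stored expiry time has passed.
local keyValue = tonumber(redis.call('get', KEYS[1]))
if (now > keyValue) then
    redis.call('set', KEYS[1], to)
    return 0
end
return 1

--- unlock
-- KEYS[1]: lock key; ARGV[1]: timestamp saved when locking; ARGV[2]: lock timeout
-- Returns 0 if the lock was released, 1 otherwise.
local begin = tonumber(ARGV[1])
local timeout = tonumber(ARGV[2])
local kt = redis.call('type', KEYS[1])
if (kt['ok'] == 'string') then
    local keyValue = tonumber(redis.call('get', KEYS[1]))
    -- Delete only if the stored expiry is the one this client wrote.
    if ((keyValue - begin) == timeout) then
        redis.call('del', KEYS[1])
        return 0
    end
end
return 1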
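To make the return codes of the two scripts easier to follow, here is a rough in-memory model of their logic. It is a sketch under assumptions, not a Redis client: the `store` dict stands in for Redis, an `int` value stands in for a string key holding a timestamp, and any other Python type stands in for a non-string key. The `lock` and `unlock` function names and all timestamps are hypothetical.

```python
store = {}  # hypothetical stand-in for the Redis keyspace

def lock(key, now, timeout):
    """Model of the lock script: 0 = acquired, 1 = held, 2 = wrong key type."""
    to = now + timeout
    if key not in store:          # SETNX succeeded
        store[key] = to
        return 0
    value = store[key]
    if not isinstance(value, int):  # models TYPE ~= 'string'
        return 2
    if now > value:               # stored expiry has passed: take over
        store[key] = to
        return 0
    return 1

def unlock(key, begin, timeout):
    """Model of the unlock script: 0 = released, 1 = not released."""
    value = store.get(key)
    if isinstance(value, int) and value - begin == timeout:
        del store[key]
        return 0
    return 1

TIMEOUT = 10
a_lock = lock("lock.foo", 100, TIMEOUT)      # client A acquires
b_lock = lock("lock.foo", 105, TIMEOUT)      # client B is refused
value_after_b = store["lock.foo"]            # A's expiry is left untouched
b_unlock = unlock("lock.foo", 105, TIMEOUT)  # B cannot release A's lock
a_unlock = unlock("lock.foo", 100, TIMEOUT)  # A releases its own lock

c_lock = lock("lock.foo", 120, TIMEOUT)      # client C acquires, expiry 130
d_lock = lock("lock.foo", 131, TIMEOUT)      # expired, so client D takes over
c_unlock = unlock("lock.foo", 120, TIMEOUT)  # C's late unlock does not delete D's lock
```

Unlike the GETSET variant, a refused client never modifies the stored value, and a client whose lock has expired and been taken over cannot delete the new holder's key.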
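Known issues
A Redis-based distributed lock has a single point of failure. That said, our business volume is nowhere near enough to bring down a single Redis instance dedicated to locking.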
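Happy New Year
To close, I wish all of you a happy new year, good health, a happy family, smooth work, every wish fulfilled, and wealth arriving right away!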