|
该帖已经被评为精华帖
|
|
|---|---|
| 作者 | 正文 |
|
时间:2005-03-10
JDK1.5 引入了 concurrent package, 提供了更多的concurrent 控制方法。
还提供了一个 ConcurrentHashMap 类。从API上看,是可以读写同步。多个thread可以同时读取,一个thread写的时候,其他thread都不能读写。 这是一个用处很广、很方便的类。我想,能不能在 jdk1.4 及以下版本也提供一个。于是查看了 ConcurrentHashMap的代码。 我本以为,实现思路应该是用到了 ReadWriteLock. 大致是这样的思路。 [code:1] // my guess class CocurrentHashMap { Private Map map = null; final ReadWriteLock rwLock = new …. ; final Lock readLock = rwLock.readLock(); final Lock writeLock = rwLock.writeLock(); // decorate the map as concurrent public CocurrentHashMap(Map m){ map = m; } // all write method, like put, putAll, remove, clear public void putAll(Map m){ writeLock.lock(); try{ map.putAll(m); }finally{ writeLock.unlock(); } } // all read method. like get, containsKey, containsValue, entrySet() public Object get(Object key){ readLock.lock(); try{ return map.get(key); }finally{ readLock.unlock(); } }; // as we can see, in such places it is convenient to use AOP here. :-) [/code:1] 看了java 1.5 code,才知道不是这样。ConcurrentHashMap是一个比较复杂的类,自己实现了Lock。 不过也无妨,上面的思路很简单,我也可以找到 third party concurrent.jar, 按照上面自己的猜测思路为 jdk1.4 (and below) 写一个concurrent map. 这时候,我又想到。Read Lock也是lock。而某些情况下,只是初始化或者 refresh的时候,map 需要write,大多数情况下,都是 read。能不能继续减少 Read Lock 的 overhead? 我采取了这样一个思路。维护两个 map. 一个是 map to read, read only. 一个是 map to write, write only. 当用户使用 read method 的时候,就用 map to read ; 当用户使用 write method的时候,就写入到 map to write,之后还要copy 一份map to write ,然后把这个copy 赋给map to read。 所以,read 的时候,非常快,几乎没有 overhead。而write的时候,非常 expensive, 每次write完之后,都要 copy 一遍。所以,建议尽量使用 putAll() method。 这个思路还有一点不利,就是 同时维护两个同样的entry 的 map. 空间上的效率不是很好,虽然,key, value pair都是object reference, 但map 的entry set结构要有两份。 代码如下。为了效率,没有用AOP。:-) [code:1] /* * Read Write Map * * Write is expensive. * Read is fast as pure HashMap. * * Note: extra info is removed for free use */ package net.sf.map; import java.util.Collection; import java.util.Map; import java.util.Set; import java.util.HashMap; import java.util.Collections; public class ReadWriteMap implements Map { protected volatile Map mapToRead = Collections.unmodifiableMap(getNewMap()); protected final Map mapToWrite = Collections.synchronizedMap(getNewMap()); // you can override it as new TreeMap(); protected Map getNewMap(){ return new HashMap(); } // copy mapToWrite to mapToRead protected void sync(){ Map newMapToRead = getNewMap(); newMapToRead.putAll(mapToWrite); mapToRead = Collections.unmodifiableMap(newMapToRead); } // read methods public int size() { return mapToRead.size(); } public boolean isEmpty() { return mapToRead.isEmpty(); } public boolean containsKey(Object key) { return mapToRead.containsKey(key); } public boolean containsValue(Object value) { return mapToRead.containsValue(value); } public Collection values() { return mapToRead.values(); } public Set entrySet() { return mapToRead.entrySet(); } public Set keySet() { return mapToRead.keySet(); } public Object get(Object key) { return mapToRead.get(key); } // write methods public void clear() { mapToWrite.clear(); sync(); } public Object remove(Object key) { Object o = mapToWrite.remove(key); sync(); return o; } public Object put(Object key, Object value) { Object o = mapToWrite.put(key, value); sync(); return o; } public void putAll(Map t) { mapToWrite.putAll(t); sync(); } } [/code:1] 希望这段代码能够帮助 有类似需求的人。 声明:JavaEye文章版权属于作者,受法律保护。没有作者书面许可不得转载。
|
|
| 返回顶楼 | |
|
时间:2005-03-10
hehe,Doug Lea大佬早就有现成的包了,1。5就是把他的包弄进去的。
1。4以及以前的可以用Doug Lea的那个包。 http://gee.cs.oswego.edu/dl/classes/EDU/oswego/cs/dl/util/concurrent/intro.html |
|
| 返回顶楼 | |
|
时间:2005-03-10
仔细看了api,好像跟buaawhl介绍的不一样啊。ConcurrentHashMap 支持并发的读写,从外部看无lock啊。从实现上来看,把table分成多个段,并发访问多个不同段,不会导致锁定。
|
|
| 返回顶楼 | |
|
时间:2005-03-10
mochow 写道 hehe,Doug Lea大佬早就有现成的包了,1。5就是把他的包弄进去的。
1。4以及以前的可以用Doug Lea的那个包。 http://gee.cs.oswego.edu/dl/classes/EDU/oswego/cs/dl/util/concurrent/intro.html 以前我看过那个包,但没有仔细看。只记得有ReadWriteLock等,所以才有用ReadWriteLock为jdk1.4实现的那个想法。 这次多谢提醒,我翻了一下,找到了 ConcurrentHashMap, ConcurrentReaderHashMap。 ConcurrentHashMap 和 jdk1.5 的很像。 ConcurrentReaderHashMap 的目的,和我写的 那段一样。 引用 A version of Hashtable that supports mostly-concurrent reading, but exclusive writing. 但是它的 read ( get, contains etc ) 方法里面用到了synchronized。还是要获得系统锁,(jvm monitor) 我后面写的ReadWriteMap 在 read 的时候,不需要获取任何锁。 一个thread 改写 mapToWrite的时候,其他 thread 可以无差别的同时读取 mapToRead。 关键在于 1. mapToRead 定义为 volatile. 2. mapToRead 从来就没有写操作,只有赋值。只有mapToRead = 这样的语句,而没有 mapToRead.put(), putAll() 这样的写操作。 nihongye 写道 仔细看了api,好像跟buaawhl介绍的不一样啊。ConcurrentHashMap 支持并发的读写,从外部看无lock啊。从实现上来看,把table分成多个段,并发访问多个不同段,不会导致锁定。 对。那个内部类 Segment extends ReentrantLock。 里面的 readValueUnderLock 方法里面用了lock. [code:1] /** * Read value field of an entry under lock. Called if value * field ever appears to be null. This is possible only if a * compiler happens to reorder a HashEntry initialization with * its table assignment, which is legal under memory model * but is not known to ever occur. */ V readValueUnderLock(HashEntry<K,V> e) { lock(); try { return e.value; } finally { unlock(); } } [/code:1] |
|
| 返回顶楼 | |
|
时间:2005-03-11
关于synchroniszed和volatile, Doug Lea有如下文字:
引用 When locking presents liveness or performance problems for a given class or program, usually the best solution is to refactor the design to use one or more of the other approaches presented in this chapter. However, there are cases where the basic logic of a synchronization-based design can be maintained even when some of the synchronized method qualifiers or blocks are removed, although sometimes at the expense of weakened semantic guarantees. Synchronizing a field accessor method sometimes (but by no means always) adds noticeable overhead to programs. Two considerations enter into any decision about whether synchronization of an accessor method can be removed: Legality. The value of the underlying field never assumes an illegal value; that is, you can ensure that the field never, even momentarily, breaks invariants. (This by definition excludes fields of type long and double.) Staleness. Clients do not necessarily require the most recently updated value of a field, but can live with possibly stale values (see § 2.2.7). If a field is not always legal, then the choices are: Synchronize the accessor (as well as all update methods). Ensure somehow that clients realize when they have obtained an illegal value and take evasive action (for example via double-checks — see § 2.4.1.2). Omit the accessor method. This applies surprisingly often. Ask yourself why any client would want to know the value of a field and what they could do with this value. Because object attributes can change asynchronously in concurrent programs, a value obtained by a client in one line of code may have changed before the next line of code is executed. Thus, accessor methods are not frequently useful in concurrent programs. Moreover, because they are not frequently useful, synchronization is unlikely to be a performance concern even if you do not remove the accessor methods in such cases. If a field is always legal but staleness is not acceptable, then you have the additional choice: Remove synchronization from the accessor, and qualify the instance variable as volatile. However, this works as expected only for scalar types, not references to arrays or Objects that help maintain object representations: accessing a volatile reference does not automatically ensure visibility of the fields or elements accessible from that reference. (In the case of Object references, you can if necessary further ensure that the accessed fields are themselves volatile.) The main disadvantage of this approach is that volatile declarations impede compiler optimizations of methods using these fields, and so can lead to a net performance loss. If a field is always legal and staleness is acceptable, then you also have the choices: Remove synchronization from the accessor without any further alteration. If you like to live dangerously, just make the field public. As an example, consider the size() method of the ExpandableArray class (§ 2.2.2), that returns the value of the size field. Inspection of all methods reveals that the value of the size field is always legal — it never assumes a value outside the range 0..data.length. (This would not be true if, for example, size were temporarily set to -1 as a resize indicator flag inside the add method.) Assuming that this constraint is documented as an internal requirement for all subclasses and future modifications, then synchronization can be removed from the accessor. The decision about staleness and need for volatile is a matter of judgment about possible usage contexts that here interacts with (among other issues) choice of traversal strategies. If clients mainly traverse elements using indexed loops: for (int i = 0; i < v.size(); ++i) // questionable System.out.println(v.get(i)); then obtaining stale values of size is not likely to be acceptable. For example, a client might obtain the value zero even when there are many elements in the array, thus skipping the loop entirely. Note that if the loop ran at all, the synchronization performed in the first invocation of method get would force a fresher value of size to be returned on the second and subsequent calls to size(). (Also recall from § 2.2.3 that clients must in any case be prepared for the size to change between the index check and the element access, so this traversal style is problematic at best anyway.) However, if either aggregates or iterators are used, each performing internal synchronization, then you could make an argument for leaving the size() method unsynchronized and the size field non-volatile, and advertising the method as only a heuristic estimate of the current number of elements. Still, this is unlikely to be acceptable to clients of a general-purpose class such as ExpandableArray. But similar reasoning may be invoked when it is acceptable for clients to obtain values that are guaranteed to be only as recent as the last synchronization points of reading and writing threads. 而且关于volatile还查到如下: 引用 Note however that volatile has been incompletely implemented in most JVMs. Using volatile may not help to achieve the results you desire (yes this is a JVM bug, but its been low priority until recently). 谁用的多,讲讲好不好用?[/b] |
|
| 返回顶楼 | |
|
时间:2005-03-11
java language spec
http://java.sun.com/docs/books/jls/second_edition/html/jTOC.doc.html jvm spec http://java.sun.com/docs/books/vmspec/2nd-edition/html/VMSpecTOC.doc.html 里面讲了volatile 的定义。 volatile 的 lock 粒度最小,是变量级别的,保证 变量的 读/写同步。 这里的写 是变量作为 左值 (=左边) 来用,assign value = 。 主要用处: 1. 计数。 volatile int count; 这样 count ++ 在多thread情况下,不会丢失计数。 2. multiple-thread-writable long, double long 和 double 的字节 很多,最容易出现 高、低位 写乱了的情况。所以,如果 multiple thread writable, 那么 应该用 volatile 修饰。这时候,jvm 按理应该保证 write/read 同步。但也许没有支持。 我的上述代码中,那个 volatile 只是起个强调的作用。不用也问题不大。 mapToRead 是一个 object reference。主要是为了保证这个 Object Reference = 的 读写同步。 Object reference 的字节个数并不如 double, lang 那么多,而且 Object Reference = 是 jvm 里面最基本的操作,即使不加上volatile, jvm 这个操作是能保证的。所以,我的代码中,即使去掉volatile, 也不会有大问题。 前面 Doug Lea 的文字中,强调的是, 引用 does not automatically ensure visibility of the fields or elements accessible from that reference 说的是 Object Reference 对应的 structure 本身的 fieild, 比如, get method等。 我的那段代码不能保证每次 都是获取的是最新的 data。但能够保证(entry set)数据结构不会损坏。 比如,thread 1 写 mapToWrite, 还没有写完,这时候,Thread 2 读 mapToRead, 可以马上读出来old value。 因为 这里的同步粒度很小,就在 mapToWrite 上 (synchronzied map decorated). |
|
| 返回顶楼 | |
|
时间:2005-03-11
既然每次写的时候都需要手工同步一下读写的值,read的变量又是volatile,单独有一个write变量也需要占用空间,而且还需要花费力气让两者同步,直接定义一个变量是volatile就行了吧.这里没看出分开的必要.
http://www.javaperformancetuning.com/news/qotm030.shtml |
|
| 返回顶楼 | |
|
时间:2005-03-11
mochow 写道 既然每次写的时候都需要手工同步一下读写的值,read的变量又是volatile,单独有一个write变量也需要占用空间,而且还需要花费力气让两者同步,直接定义一个变量是volatile就行了吧.这里没看出分开的必要.
http://www.javaperformancetuning.com/news/qotm030.shtml 同感 |
|
| 返回顶楼 | |
|
时间:2005-03-11
buaawhl的设计思路非常好,也是就写操作在一个独立的hashmap上进行,这样,读操作不需要同步,读操作可以获得最大的性能。
这在读操作频繁而写操作极少的情况下非常有效,比如权限管理,而且设计是如此的简单,令人喜不自禁。 不过,buaawhl可能写得比较匆忙,维护了两个hashmap,空间上需要两倍,这里可以改进一下,只需要维护一个hashmap既可。 [code:1] import java.util.Collection; import java.util.Map; import java.util.Set; import java.util.HashMap; import java.util.Collections; public class ReadWriteMap implements Map { protected volatile Map mapToRead = Collections.unmodifiableMap(getNewMap()); // protected final Map mapToWrite = Collections.synchronizedMap(getNewMap()); // you can override it as new TreeMap(); protected Map getNewMap() { return new HashMap(); } // make a copy of mapToWrite protected Map copy(){ Map newMapToRead = getNewMap(); newMapToRead.putAll(mapToRead); return newMapToRead; } // read methods public int size() { return mapToRead.size(); } public boolean isEmpty() { return mapToRead.isEmpty(); } public boolean containsKey(Object key) { return mapToRead.containsKey(key); } public boolean containsValue(Object value) { return mapToRead.containsValue(value); } public Collection values() { return mapToRead.values(); } public Set entrySet() { return mapToRead.entrySet(); } public Set keySet() { return mapToRead.keySet(); } public Object get(Object key) { return mapToRead.get(key); } // write methods public synchronized void clear() { Map map=copy(); map.clear(); mapToRead=map; } public synchronized Object remove(Object key) { Map map=copy(); Object o=map.remove(key); mapToRead=map; return o; } public synchronized Object put(Object key, Object value) { Map map=copy(); Object o=map.put(key,value); mapToRead=map; return o; } public synchronized void putAll(Map t) { Map map=copy(); map.putAll(t); mapToRead=map; } } [/code:1] |
|
| 返回顶楼 | |
|
时间:2005-03-11
octfor 写道 不过,buaawhl可能写得比较匆忙,维护了两个hashmap,空间上需要两倍,这里可以改进一下,只需要维护一个hashmap既可。 好。 octfor的写法,整个结构清晰明了了许多,空间也进一步节省 (虽然写的同步粒度进一步加大,但写 本来就是假设很少发生的事情)。 比我的写法优越许多,我改成你的写法。我当时只考虑到,write 和 read应该在不同的 hashMap上,没有想到,mapToWrite完全可以是一个临时map。 Thanks. 其中还可以有一点 小改进。:-) 引用 public synchronized void clear() { mapToRead=getNewMap(); } |
|
| 返回顶楼 | |










