A CPU design question: how do you implement an atomic add operation?
Thanks!
LL/SC (load-link/store-conditional)
http://en.wikipedia.org/wiki/Load-link/store-conditional
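A minimal sketch of how an atomic add is usually built on LL/SC: load the value, compute the sum, try to store it, and retry if another CPU touched the location in between. Portable C has no direct LL/SC primitive, so this sketch assumes C11 `<stdatomic.h>` and uses a weak compare-and-swap as a stand-in (compilers for LL/SC machines typically lower it to a load-link/store-conditional pair). The function name `atomic_add_retry` is made up for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Atomically add `delta` to `*addr` and return the value seen before the add.
 * The loop mirrors the LL/SC pattern: load, compute, attempt the store,
 * and retry when the store fails because the location changed. */
uint32_t atomic_add_retry(_Atomic uint32_t *addr, uint32_t delta)
{
    assert(addr != NULL);

    /* Plays the role of the "load-link". */
    uint32_t old = atomic_load_explicit(addr, memory_order_relaxed);

    /* Plays the role of the "store-conditional". On failure the call
     * refreshes `old` with the current value, so `old + delta` is
     * recomputed on the next iteration. */
    while (!atomic_compare_exchange_weak_explicit(
               addr, &old, old + delta,
               memory_order_seq_cst, memory_order_relaxed)) {
        /* Store failed: retry with the refreshed value. */
    }
    return old;
}
```

Calling `atomic_add_retry(&counter, 1)` from several CPUs at once should lose no increments, because a store only succeeds when nobody else modified the location since the load.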
On x86, each core (or each processor in a cluster) has its own segment registers, so each one works in its own program segments. If they want to access a common resource, for example to perform an I/O operation, the program has to do a task switch according to the GDT/LDT entries and call into the OS kernel for help.
(1)Guaranteed atomic operations
(2)Bus locking, using the LOCK# signal and the LOCK instruction prefix
(3)Cache coherency protocols that ensure that atomic operations can be carried out on cached data structures (cache lock)
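From software, the mechanisms listed above are normally reached through the language's atomic operations rather than by writing the LOCK prefix by hand. As a hedged sketch, assuming C11 `<stdatomic.h>`: on x86, GCC and Clang typically compile `atomic_fetch_add` into a LOCK-prefixed `xadd` or `add`. The name `count_packets` is hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Atomically add `n` to a shared counter and return the value it held
 * before the add. Relaxed ordering is enough for a pure statistics
 * counter; use a stronger order if other data depends on the count. */
uint64_t count_packets(_Atomic uint64_t *counter, uint64_t n)
{
    assert(counter != NULL);
    return atomic_fetch_add_explicit(counter, n, memory_order_relaxed);
}
```

Whether the hardware then asserts a bus lock or only locks the cache line is decided by the processor, not by this code.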
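For example, say I have a MAC table. Different CPUs receive different packets, and all of them need to update this MAC table. How would that be implemented?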
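One possible approach to the MAC table case, as a rough sketch only: protect each table slot with a small spinlock built on an atomic test-and-set, so CPUs updating different slots do not block each other. Everything here is an assumption for illustration: the table size, the direct-mapped layout (a colliding address simply overwrites the slot), the XOR hash, and the names `mac_learn` and `mac_lookup`.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define MAC_BUCKETS 256                 /* assumed table size */

struct mac_entry {
    atomic_flag lock;                   /* per-slot spinlock */
    bool        valid;
    uint8_t     mac[6];
    uint16_t    port;
};

static struct mac_entry mac_table[MAC_BUCKETS];

/* Toy hash: XOR of the six address bytes. */
static unsigned mac_hash(const uint8_t mac[6])
{
    unsigned h = 0;
    for (int i = 0; i < 6; i++)
        h ^= mac[i];
    return h % MAC_BUCKETS;
}

static void slot_lock(struct mac_entry *e)
{
    /* Spin until the flag was previously clear, i.e. we own the slot. */
    while (atomic_flag_test_and_set_explicit(&e->lock, memory_order_acquire))
        ;
}

static void slot_unlock(struct mac_entry *e)
{
    atomic_flag_clear_explicit(&e->lock, memory_order_release);
}

/* Call once before any CPU uses the table. */
void mac_table_init(void)
{
    for (int i = 0; i < MAC_BUCKETS; i++) {
        atomic_flag_clear(&mac_table[i].lock);
        mac_table[i].valid = false;
    }
}

/* Record that `mac` was seen on `port`; safe to call from several CPUs. */
void mac_learn(const uint8_t mac[6], uint16_t port)
{
    struct mac_entry *e = &mac_table[mac_hash(mac)];
    slot_lock(e);
    memcpy(e->mac, mac, 6);
    e->port  = port;
    e->valid = true;
    slot_unlock(e);
}

/* Look up `mac`; on a hit store the port in `*port_out` and return true. */
bool mac_lookup(const uint8_t mac[6], uint16_t *port_out)
{
    assert(port_out != NULL);
    struct mac_entry *e = &mac_table[mac_hash(mac)];
    slot_lock(e);
    bool hit = e->valid && memcmp(e->mac, mac, 6) == 0;
    if (hit)
        *port_out = e->port;
    slot_unlock(e);
    return hit;
}
```

The lock is needed because an entry (6-byte address plus port) is wider than a single atomic word, so the whole update has to look indivisible to other CPUs.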
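Take a look at an operating systems course:
P/V operations (semaphores).
Then study how P/V operations are implemented,
for example with the LL/SC someone mentioned above.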
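To make the link between P/V and an atomic primitive concrete, here is a minimal sketch of a counting semaphore, assuming C11 `<stdatomic.h>`. P decrements the count with a compare-and-swap retry loop (the same shape as an LL/SC loop) and V increments it atomically. The `pv_` names are hypothetical, and `pv_p` busy-waits where a real OS would put the task to sleep.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

typedef struct {
    _Atomic int count;          /* number of available resources */
} pv_sem;

void pv_init(pv_sem *s, int initial)
{
    assert(initial >= 0);
    atomic_init(&s->count, initial);
}

/* Non-blocking P: take one unit if available, otherwise return false. */
bool pv_try_p(pv_sem *s)
{
    int c = atomic_load_explicit(&s->count, memory_order_relaxed);
    while (c > 0) {
        /* Succeeds only if the count is still `c`; on failure `c` is
         * refreshed with the current count and the loop re-checks it. */
        if (atomic_compare_exchange_weak_explicit(
                &s->count, &c, c - 1,
                memory_order_acquire, memory_order_relaxed))
            return true;
    }
    return false;
}

/* P: wait until a unit is available, then take it (busy-wait sketch). */
void pv_p(pv_sem *s)
{
    while (!pv_try_p(s))
        ;
}

/* V: release one unit. */
void pv_v(pv_sem *s)
{
    atomic_fetch_add_explicit(&s->count, 1, memory_order_release);
}
```

With `pv_init(&s, 1)` the semaphore acts as a mutex: `pv_p(&s)` before the critical section and `pv_v(&s)` after it.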
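I have skimmed some operating systems books, but none of them explain "how to implement P/V operations" very clearly.
Is there a book or website that covers this in more detail?
Take a look at Parallel Computer Architecture, specifically the material on memory consistency.
Sharing a passage from Wikipedia: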
http://en.wikipedia.org/wiki/Atomic_operation:
Cache coherence mechanisms
Directory-based coherence: In a directory-based system, the data being shared is placed in a common directory that maintains the coherence between caches. The directory acts as a filter through which the processor must ask permission to load an entry from the primary memory to its cache. When an entry is changed the directory either updates or invalidates the other caches with that entry.
Snooping is the process where the individual caches monitor address lines for accesses to memory locations that they have cached. When a write operation is observed to a location that a cache has a copy of, the cache controller invalidates its own copy of the snooped memory location.
Snarfing is where a cache controller watches both address and data in an attempt to update its own copy of a memory location when a second master modifies a location in main memory. When a write operation is observed to a location that a cache has a copy of, the cache controller updates its own copy of the snarfed memory location with the new data.
Distributed shared memory systems mimic these mechanisms in an attempt to maintain consistency between blocks of memory in loosely coupled systems.
The two most common types of coherence that are typically studied are Snooping and Directory-based, each having its own benefits and drawbacks. Snooping protocols tend to be faster, if enough bandwidth is available, since all transactions are a request/response seen by all processors. The drawback is that snooping isn't scalable. Every request must be broadcast to all nodes in a system, meaning that as the system gets larger, the size of the (logical or physical) bus and the bandwidth it provides must grow. Directories, on the other hand, tend to have longer latencies (with a 3 hop request/forward/respond) but use much less bandwidth since messages are point to point and not broadcast. For this reason, many of the larger systems (>64 processors) use this type of cache coherence.
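The write-invalidate snooping behaviour described in the passage can be illustrated with a toy model. This is only a sketch under strong simplifying assumptions: a single memory location, write-through caches, and a loop standing in for the shared bus that every cache snoops. All names (`toy_system`, `toy_read`, `toy_write`) are made up.

```c
#include <assert.h>
#include <stdbool.h>

#define NCPU 4                          /* assumed number of caches */

struct cache_line {
    bool valid;
    int  value;
};

struct toy_system {
    int               memory;           /* the one shared location */
    struct cache_line cache[NCPU];      /* one line per CPU */
};

/* Read through `cpu`'s cache; a miss fills the line from memory. */
int toy_read(struct toy_system *sys, int cpu)
{
    assert(cpu >= 0 && cpu < NCPU);
    struct cache_line *line = &sys->cache[cpu];
    if (!line->valid) {
        line->value = sys->memory;
        line->valid = true;
    }
    return line->value;
}

/* Write from `cpu`: update memory and the writer's line, then let every
 * other cache "snoop" the write and invalidate its own copy. */
void toy_write(struct toy_system *sys, int cpu, int value)
{
    assert(cpu >= 0 && cpu < NCPU);
    sys->memory = value;                /* write-through */
    sys->cache[cpu].value = value;
    sys->cache[cpu].valid = true;
    for (int other = 0; other < NCPU; other++) {
        if (other != cpu)
            sys->cache[other].valid = false;
    }
}
```

After one CPU writes, the next read on any other CPU misses and refetches the new value, which is what keeps the caches coherent. A snarfing (write-update) variant would copy the new value into the other lines instead of clearing `valid`.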
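On x86, for an ALU instruction with the lock prefix, if an invalidate message (from the cache coherence protocol) arrives during the read-modify-write sequence, the read-modify-write has to be executed again.
A real implementation also has to deal with many details, such as accesses that cross a cache line or a page boundary, and avoiding livelock caused by CPUs repeatedly invalidating or intervening on each other. Different microarchitectures handle these cases in different ways.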
