Thursday, October 1, 2015

Software Protection and Simulations on Oblivious RAMs

This paper provides a "theoretical treatment of software protection" by reducing the problem to the problem of efficient simulation on oblivious RAM. It tries to create a smart system which can defeat the adversary who is trying to duplicate the software. 
The problem with our normal software (even if encrypted) is that, adversary can view the memory access patterns made by the processor and infer the underlying algorithm. The adversary can run the program multiple times and record the patterns of memory accesses each time. As a simple example, take the binary search algorithm.  

Example Attack on Normal Systems

The processor will initially access the middle element and the depending on that element, will either access the middle element in the left half or the middle element in the right half and so on. The adversary has access to the memory and can see the (encrypted) data flowing from memory to the CPU. By noticing this access pattern the adversary can ascertain with high confidence that binary search is used in the underlying software. 
So there is a need to hide the data in the memory as well as the memory accesses (type and address) made by the CPU. The adversary should not be able to infer anything new by looking at the access pattern.  

Role of Hardware 

The paper claims that purely software-based solution is impossible to prevent piracy since any software (even if encrypted) can be copied bit by bit and can be run on any machine. Therefore a Software-Hardware-package consisting of a physically shielded CPU and an encrypted program needs to be used. 
Shielded CPU: 
·           Has a small amount of storage (registers) 
·          Will run the encrypted program using the memory and other components of the computer 
·          Has to be physically shielded (meaning adversary cannot see what is happening inside and cannot interfere with it in any way) 

The adversary tries to learn the software by initiating executions of the encrypted program on a choice of inputs and watching both the memory contents and the communication between CPU and memory. 
The CPU defeats the adversary if no probabilistic polynomial-time adversary can distinguish  
So there are two types of protection that is needed 
1.             The program along with the data needs to be encrypted 
2.              The access patterns must be hidden 
The example of binary search shows that encryption alone is not sufficient. We call a machine oblivious if the sequence in which it accesses memory locations is equivalent for any two inputs with the same running time. 

Hiding Access Patterns – Trivial Method 

Whenever CPU needs to access a variable, the entire contents of the memory is read and written in some fixed order. This shows that it is possible to completely hide the access pattern. For deterministic schemes, this is the best technique.  To do better, randomization needs to be introduced. 

Hiding Access Patterns – ORAM 

To hide the access patterns, RAM need to be simulated in an oblivious way. To do that ORAM technique is used. 
Let M be the RAM we need to simulate and let m=|M| 
Number of accesses is t
To simulate RAM of size m, ORAM of size m+2m is required 

The ORAM can be visualized as 

<Virtual>
<Dummy>
<Shelter>
m 
√m
√m

Each entry in the ORAM is of the form (vi,di) where vis the address in the original RAM and dis the data stored at vi

√m dummy values along with the m actual contents of the RAM are permuted in some random order using the permutation π(1, ..., m +√m ) and placed in the first m +√m locations (Virtual + Dummy) of the ORAM (rest are filled with empty values).  This permutation is done using Batcher's sorting network and the randomization comes from inside the secure CPU. So after this permutation, the adversary cannot make out the mapping of the order in ORAM to the actual order. 

The CPU keeps a counter count that counts the number of memory accesses. Initially this is kept to zero. 
1.       The CPU is trying to access address vi 
2.       Initialize found flag = false 
3.       First go to the shelter and scan (read) every element. Even if the address has been found, continue scanning. Total √m accesses. If vi is found, set flag=true
4.       If flag=false, go to π(i) and take it to CPU. 
·          Write it back to count position of shelter. 
5.       If flag=true, go to π(m+count). This will be in the dummy part of ORAM. This is done so that adversary cannot learn whether a new address was accessed or an old one. 
6.       After accessing the updating is done on the shelter. If the address was found on the shelter, then that location is updated. Else the leftmost empty is updated.
7.       If count=m, permute again and repeat. 
In order to permute again, sort first based on v and then based on σ where σ=0 means the value was modified and σ=1 means it is unmodified. Then all the entries with σ=1 is made as empty in the sorted order. The permutation has to ensure that all the empty values will go to the shelter.
After the permutation, the first m+√m entries will have the m data along with some dummy data. The process can be repeated then.

Correctness

All the memory accesses look the same to the adversary.
Step 3 - √m accesses are fixed
Step 4 and Step 5 – are indistinguishable because the original data has been randomized throughout the first m+√m entries.
Step 6 - √m accesses are fixed

References:
Goldreich, O., Ostrovsky, R.: Software protection and simulation by oblivious RAMs. JACM (1996)

No comments:

Post a Comment