This paper provides a "theoretical
treatment of software protection" by reducing the problem to the problem
of efficient simulation on oblivious RAM. It tries to
create a smart system which can defeat the adversary who is
trying to duplicate the software.
The problem with our normal software (even if
encrypted) is that, adversary can view the memory access patterns made by the
processor and infer the underlying algorithm. The adversary can run the program
multiple times and record the patterns of memory accesses each time. As a
simple example, take the binary search algorithm.
Example Attack on Normal Systems
The processor will initially access the middle
element and the depending on that element, will either access the middle
element in the left half or the middle element in the right half and so on. The
adversary has access to the memory and can see the (encrypted) data flowing
from memory to the CPU. By noticing this access pattern the adversary can
ascertain with high confidence that binary search is used in the underlying
software.
So there is a need to hide the data in the
memory as well as the memory accesses (type and address) made by the CPU. The
adversary should not be able to infer anything new by looking at the access
pattern.
Role of Hardware
The paper claims that
purely software-based solution is impossible to prevent piracy since any
software (even if encrypted) can be copied bit by bit and can be run on any
machine. Therefore a Software-Hardware-package consisting
of a physically shielded CPU and an encrypted program
needs to be used.
Shielded CPU:
· Has a small amount of storage
(registers)
· Will run the encrypted program using the
memory and other components of the computer
· Has to be physically shielded (meaning
adversary cannot see what is happening inside and cannot interfere with it in
any way)
The adversary tries to learn the software by
initiating executions of the encrypted program on a choice of inputs and
watching both the memory contents and the communication between CPU and
memory.
The CPU defeats the adversary if no probabilistic
polynomial-time adversary can distinguish
So there are two types of protection that is
needed
1. The program along with the data needs to be
encrypted
2.
The access patterns must be hidden
The example of binary search shows that
encryption alone is not sufficient. We call a machine oblivious if
the sequence in which it accesses memory locations is equivalent for any two
inputs with the same running time.
Hiding Access Patterns – Trivial Method
Whenever CPU needs to access a variable, the
entire contents of the memory is read and written in some fixed order. This
shows that it is possible to completely hide the access pattern.
For deterministic schemes, this is the best technique. To do better, randomization needs to be
introduced.
Hiding Access Patterns – ORAM
To hide the access patterns, RAM need to be
simulated in an oblivious way. To do that ORAM technique is used.
Let M be the
RAM we need to simulate and let m=|M|
Number of
accesses is t.
To simulate RAM
of size m, ORAM of size m+2√m is required
The ORAM can be visualized as
<Virtual>
|
<Dummy>
|
<Shelter>
|
m
|
√m
|
√m
|
Each entry in the ORAM is of the form (vi,di) where vi is the address in the original RAM and di is the data stored at vi.
√m dummy
values along with the m actual contents of the RAM are permuted in some random
order using the permutation π(1, ..., m +√m ) and
placed in the first m +√m locations (Virtual + Dummy) of the ORAM (rest are filled with
empty values). This permutation is done using Batcher's sorting network
and the randomization comes from inside the secure CPU. So after this
permutation, the adversary cannot make out the mapping of the
order in ORAM to the actual order.
The CPU keeps a counter count that counts the number of memory
accesses. Initially this is kept to zero.
1. The CPU is
trying to access address vi .
2. Initialize found
flag = false
3. First go to the
shelter and scan (read) every element. Even if the address has been found,
continue scanning. Total √m accesses. If vi is
found, set flag=true.
4. If flag=false,
go to π(i) and
take it to CPU.
·
Write it back to count position
of shelter.
5. If flag=true, go
to π(m+count). This will be in the dummy part of ORAM. This is done so that adversary cannot learn
whether a new address was accessed or an old one.
6. After
accessing the updating is done on the shelter. If the address was found on the
shelter, then that location is updated. Else the leftmost empty is updated.
7. If count=m, permute again and repeat.
In order to permute again, sort first based on v and then based on σ where σ=0 means the value was
modified and σ=1 means it is
unmodified. Then all the entries with σ=1 is made as empty
in the sorted order. The permutation has to ensure that all the empty values
will go to the shelter.
After the permutation, the first m+√m entries will have the m data along with some dummy data. The
process can be repeated then.
Correctness
All the memory accesses look the same to
the adversary.
Step 3 - √m accesses are fixed
Step 4 and Step 5 – are indistinguishable
because the original data has been randomized throughout the first m+√m entries.
Step 6 - √m accesses are fixed
References:
No comments:
Post a Comment