When reading the paper, I had two additional questions that I think might be important

Some questions about the design (CLOSED)

dp-3t commented on August 20, 2024 1


from documents.

Comments (10)

arelaxend commented on August 20, 2024 1

Good.

What is the proposition?
User devices must send dummy data "of the same size" to the central system even if their users are not Covid-positive.

What is the purpose of the proposition?
If the protocol does not include the proposition, it is exposed to man-in-the-middle attacks.

Without dummy data, an attacker can observe that devices with IP address x send network packets to the central servers, and therefore infer that those IP addresses belong to Covid-positive people.

With dummy data "of the same size", by contrast, an attacker can only infer that some IP addresses belong to devices with the application installed.

Why does it solve the problem?
An attacker cannot decrypt a packet without the right keys. This is the purpose of using a secure connection.
But the attacker can still listen on the network and say "there has been an exchange from A to the central servers". Going further, the attacker can infer that the exchanges are of type "daily keys" by looking at the size, frequency, or number of packets, or at other metadata.
Without dummy data, if a device communicates with the central servers, this means that its user is Covid-positive.
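
The inference described above can be illustrated with a minimal sketch. Everything in it is hypothetical: the IP addresses, the `Upload` record, and the 2048-byte size are made up for illustration and are not taken from the DP3T documents.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Upload:
    """Metadata a passive network observer can see; the payload is encrypted."""
    src_ip: str
    size: int  # bytes on the wire


def suspected_positive(observed, key_upload_size):
    """IPs the observer would flag as 'uploaded daily keys', judging by size alone."""
    return {u.src_ip for u in observed if u.size == key_upload_size}


KEY_UPLOAD_SIZE = 2048  # hypothetical size of a real key upload

# Without dummy traffic: only the Covid-positive user contacts the server.
without_dummy = [Upload("10.0.0.1", KEY_UPLOAD_SIZE)]

# With dummy traffic of the same size: every app user produces the same metadata.
with_dummy = [
    Upload(ip, KEY_UPLOAD_SIZE) for ip in ("10.0.0.1", "10.0.0.2", "10.0.0.3")
]

flagged_without = suspected_positive(without_dummy, KEY_UPLOAD_SIZE)
flagged_with = suspected_positive(with_dummy, KEY_UPLOAD_SIZE)
```

In the first case the flagged set contains exactly the positive user's IP address. In the second it contains every address running the app, so the metadata reveals only that the application is installed.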

Does it apply to the Google/Apple protocol?
Yes

Is there any reference to this issue in the DP3T papers?
Yes and no.
In the DP3T - Data Protection and Security paper, one can read:

To combat traffic analysis, apps will periodically upload data to the epidemiology research
lab, even if they have not been in contact with an infected person. More precisely, the app
picks a schedule for uploading data to the epidemiology research lab (e.g., once every 2
days). For every scheduled event, the phone creates an encrypted connection to the lab,
and uploads either the real data, or a dummy message of the same size.
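
One way to read "real data, or a dummy message of the same size" is sketched below. This is an assumption-laden illustration, not the DP3T implementation: `MESSAGE_SIZE`, the 2-byte length header, and the function names are all hypothetical.

```python
import os
import struct

MESSAGE_SIZE = 1024  # hypothetical fixed upload size in bytes
HEADER = struct.Struct(">H")  # 2-byte big-endian length of the real data


def build_upload(real_data=None):
    """Return a MESSAGE_SIZE-byte body: real data plus padding, or an all-padding dummy.

    The body is meant to be sent over an encrypted connection, so an observer
    cannot read the length header and sees MESSAGE_SIZE bytes either way.
    """
    data = real_data or b""
    capacity = MESSAGE_SIZE - HEADER.size
    if len(data) > capacity:
        raise ValueError("real data does not fit in one fixed-size message")
    padding = os.urandom(capacity - len(data))
    return HEADER.pack(len(data)) + data + padding


def parse_upload(body):
    """Server side: recover the real data; an empty result means a dummy message."""
    (length,) = HEADER.unpack_from(body)
    return body[HEADER.size:HEADER.size + length]
```

On each scheduled event the phone would call `build_upload(data)` if it has something to report and `build_upload()` otherwise. Both calls produce a body of the same length.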

More on this for the case of HTTPS

With HTTPS, HTTP traffic is encrypted by TLS, which sits below the HTTP application layer. That is, the GET request (full URL) and the HTTP headers are encrypted, and an application eavesdropping on the network traffic will not be able to decrypt the traffic.

That said, you could log the IP addresses (especially those connected to servers on port 443 - HTTPS) as the IP layer is not encrypted with HTTPS.

This is what your netstat command does. It looks for TCP connections on your network card, and notes which ones are connecting to port 443 and observes the IP address of the HTTPS servers you are connecting to.

Alexandre


commented on August 20, 2024

Thanks for addressing my questions. When I read the paper, I thought that sending the dummy data to the epidemiologists was a good approach to avoid this type of attack. I hope this will be added to the design for the central system as well. I do not see any clear objections to doing so.


arelaxend commented on August 20, 2024

I have referenced this in another thread, since you have two questions.
#144


commented on August 20, 2024

Thanks!


lbarman commented on August 20, 2024

Thank you both for your inputs.

What are the number of false positives and false negatives that can be expected with this technology?

This question depends on many things we don't control yet. Assuming 100% of the population uses the app correctly, and that contact-tracing on the Bluetooth layer works at 100% (which is not the case due to loss rates etc), then: FP very close to 0% (you are never declared at risk wrongly, except if there is a collision in the EphID space), and FN 0%.
What this really means is that these numbers are not dictated by the design (centralized vs decentralized) but more by the Bluetooth technology (which will be common for many systems) and the adoption rate.
For completeness, we note that Design 2 uses Bloom filters which increase the FP rate, but only "negligibly" if done correctly (I don't have precise numbers on this).
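
To give a rough sense of scale, the standard approximation for a Bloom filter's false-positive probability can be evaluated directly. The parameters below (items stored, bits, hash functions, lookups) are hypothetical and are not taken from the DP3T design.

```python
import math


def bloom_fp_rate(n_items, m_bits, k_hashes):
    """Standard approximation (1 - e^(-k*n/m))^k for one lookup."""
    return (1.0 - math.exp(-k_hashes * n_items / m_bits)) ** k_hashes


def any_false_positive(per_lookup_rate, lookups):
    """Chance that at least one of several independent lookups is a false positive."""
    return 1.0 - (1.0 - per_lookup_rate) ** lookups


# Hypothetical sizing: 10 bits per stored item and 7 hash functions.
rate = bloom_fp_rate(n_items=100_000, m_bits=1_000_000, k_hashes=7)

# The same filter with twice as many bits per item.
rate_larger = bloom_fp_rate(n_items=100_000, m_bits=2_000_000, k_hashes=7)
```

With this sizing the per-lookup rate is just under 1%, and it falls as more bits are spent per item. Because a phone checks many observed EphIDs against the filter, `any_false_positive` is the more relevant quantity per user, which is why the filter has to be sized conservatively.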

Why is dummy data sent to the epidemiologists, but not to the central system?

Dummy data should be sent to both, indeed.

Does it answer your questions?


commented on August 20, 2024

Thank you very much for your reply. I agree that many of the factors that impact the FP and FN rates are out of the designer's control, although not all of them (i.e., using Bluetooth). I respectfully disagree that choosing between a decentralised or centralised design cannot impact the FN rate. Because of the decentralised design, there is no centralised party that can construct a social graph. However, it could lead to more FNs. For example, if new infections are found for which the source is unknown, then with a more complete social graph it would be much easier to find a common source of infection and warn its other contacts that might be infected as well.

I completely agree with choosing a decentralised solution. However, it is important to have insight into what we cannot do with it (or can only do with more difficulty) and what risks this poses for FNs and FPs. This provides the opportunity to further improve the design as well as to get insight into the appropriate policies and organisational measures that need to be in place to deal with them.

An example of a change to the design is adding data that says whether someone infected was alerted by the app or not and sending that to the epidemiologists, so that appropriate measures and maybe additional human contact tracing can be performed when necessary. Of course, the effects on privacy and accuracy of doing so (e.g. someone could have been infected by someone other than the person who caused them to receive an alert) should be considered carefully. Examples of policies are reserving resources for contact tracing by humans in that case and ensuring that these resources are used effectively and efficiently. Another example is promoting the app among the elderly, so that there are no large groups without the app in which the infection can spread freely out of sight.

If it does not exist already, it might be useful to at least have a list of factors that could possibly influence FP and FN rates and how they can be manipulated either by the design of the system or by policies and organisational measures.


lbarman commented on August 20, 2024

For example, if new infections are found for which the source is unknown, then with a more complete social graph it would be much easier to find a common source of infection and warn its other contacts that might be infected as well.

Ah, I see! You mean that with full graphs in the backend, the system can compensate for some missing infection reports and so produce fewer FNs (because Alice and Bob were sick and their only common friend is Claire, who did not report for reason X, Y, or Z, but the centralized system can still preemptively warn all of Claire's friends). Interesting, I hadn't thought of that before. Do you know of a system that does this?

To play the devil's advocate, I'm not sure how many false positives this generates ;)
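
The Alice/Bob/Claire scenario can be sketched as a small graph query. This is only an illustration of the idea under discussion, not something any of the designs implements. The graph, the extra names Dave and Erin, and the `min_links` threshold are hypothetical.

```python
from collections import Counter


def likely_common_sources(contacts, reported, min_links=2):
    """People who have not reported but are linked to at least min_links reported cases."""
    counts = Counter(
        neighbour
        for person in reported
        for neighbour in contacts.get(person, ())
        if neighbour not in reported
    )
    return {person for person, n in counts.items() if n >= min_links}


def people_to_warn(contacts, reported, min_links=2):
    """Contacts of the inferred sources who have not reported themselves."""
    sources = likely_common_sources(contacts, reported, min_links)
    warned = set()
    for source in sources:
        warned.update(contacts.get(source, ()))
    return warned - set(reported) - sources


# Hypothetical contact graph (undirected, stored as adjacency sets).
contacts = {
    "Alice": {"Claire"},
    "Bob": {"Claire"},
    "Claire": {"Alice", "Bob", "Dave", "Erin"},
    "Dave": {"Claire"},
    "Erin": {"Claire"},
}
reported = {"Alice", "Bob"}

sources = likely_common_sources(contacts, reported)
warned = people_to_warn(contacts, reported)
```

Here Claire is inferred as the common source and Dave and Erin are warned. If Claire was never actually infected, both warnings are false positives, which is the trade-off raised above.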

If it does not exist already, it might be useful to at least have a list of factors that could possibly influence FP and FN rates and how they can be manipulated either by the design of the system or by policies and organisational measures.

It would certainly be interesting! I'll raise the issue internally. I'm interested if you know of a system that does the reconstruction-thingy you mentioned.

Thanks!


commented on August 20, 2024

Yes, that is exactly what I mean! I agree that it is likely that this could generate more FPs. The question is, however, which is worse, and which risks of FPs and FNs we should and should not accept.

With the reconstruction-thingy, I think you are referring to constructing the social graph. To construct a social graph, you need a list of IDs and the connections between them. So, I guess that this would be possible any time this data is stored centrally. According to this white paper by the ACLU, China and Israel are constructing these graphs using cellphone data (it does indeed mention a high FP rate for these solutions).

Great that you will raise the issue internally. I have a background in sociotechnical research. If you need input on this or if I can help, just let me know. I think most of the factors on the list can be derived quite easily from the design and its interaction with the social system. Determining how large their impact is will be far more difficult; however, we might not need to know the exact numbers to know how to deal with them.


lbarman commented on August 20, 2024

Sure! For now I think we're not there yet ;) but to be kept in mind. Thanks for your input !


commented on August 20, 2024

No problem!
