The Unlinkable Data Challenge: Advancing Methods in Differential Privacy

About the Challenge
Propose a mechanism to protect personally identifiable information while maintaining a dataset's utility

Posted By: National Institute of Standards and Technology, Public Safety Communications Innovation Accelerator
Category: Software/Apps, Ideas, Designs, Scientific/Engineering
Skill: Algorithms
Interest: Public Safety
Submission Dates: 12 p.m. ET, May 01, 2018 - 5 p.m. ET, Aug 02, 2018

Submissions are in…Stage 1 is closed! 

Check back for winner information Sept 12th

Challenge Overview

JOIN US for this exciting challenge… designed to advance data privacy and public safety data!  This challenge is focused on proactively protecting individual privacy while allowing for data to be used by researchers for positive purposes and outcomes.  It is well known that privacy in data release is an important area for the Federal Government (which has an Open Data Policy), state governments, the public safety sector and many commercial non-governmental organizations.  Developments coming out of this competition would drive major advances in the practical applications of differential privacy for these organizations.

The purpose of this series of competitions is to provide a platform for researchers to develop more advanced differentially private methods that can substantially improve the privacy protection and utility of the resulting datasets.

Background

Databases across the country include information with potentially important research implications and uses, e.g. contingency planning in disaster scenarios, identifying safety risks in aviation, assisting in tracking contagious diseases, and identifying patterns of violence in local communities.  However, these datasets include personally identifiable information (PII), and it is not enough to simply remove PII from them.  It is well known that auxiliary and possibly completely unrelated datasets, combined with records in the dataset, can be used to single out uniquely identifiable individuals (known as a linkage attack).  Today’s efforts to remove PII do not provide adequate protection against linkage attacks. With the advent of “big data” and technological advances in linking data, there are far too many other possible data sources related to each of us that can lead to our identity being uncovered.
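To make the linkage attack concrete, here is a minimal Python sketch that joins a table with names removed to an auxiliary table on shared quasi-identifiers. Every record, every field name, and the choice of ZIP code, birth date, and sex as quasi-identifiers are invented for illustration; none of it comes from the challenge materials.

```python
# Hypothetical "released" table: names were removed, but quasi-identifiers
# (zip, birth_date, sex) were left in place.
released = [
    {"zip": "20899", "birth_date": "1975-03-14", "sex": "F", "diagnosis": "asthma"},
    {"zip": "20899", "birth_date": "1982-11-02", "sex": "M", "diagnosis": "diabetes"},
    {"zip": "80305", "birth_date": "1990-07-21", "sex": "F", "diagnosis": "influenza"},
]

# Hypothetical auxiliary dataset that still carries names.
auxiliary = [
    {"name": "Alice Example", "zip": "20899", "birth_date": "1975-03-14", "sex": "F"},
    {"name": "Bob Example", "zip": "20899", "birth_date": "1982-11-02", "sex": "M"},
    {"name": "Carol Example", "zip": "80305", "birth_date": "1961-01-30", "sex": "F"},
]

QUASI_IDENTIFIERS = ("zip", "birth_date", "sex")


def link(released_rows, auxiliary_rows, keys=QUASI_IDENTIFIERS):
    """Return (name, released_row) pairs whose keys match exactly one auxiliary record."""
    index = {}
    for row in auxiliary_rows:
        index.setdefault(tuple(row[k] for k in keys), []).append(row["name"])
    matches = []
    for row in released_rows:
        names = index.get(tuple(row[k] for k in keys), [])
        if len(names) == 1:  # a unique match re-identifies the record
            matches.append((names[0], row))
    return matches


reidentified = link(released, auxiliary)
for name, row in reidentified:
    print(name, "->", row["diagnosis"])
```

In this toy data, two of the three released records match exactly one named person, even though the released table contains no names.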

Get Involved – How to Participate

The Unlinkable Data Challenge is a multi-stage Challenge.  This first stage of the Challenge is intended to source detailed concepts for new approaches, inform the final design in the two subsequent stages, and provide recommendations for matching stage 1 competitors into teams for subsequent stages.  Teams will predict and justify where their algorithm fails with respect to the utility-privacy frontier curve.

In this stage, competitors are asked to propose how to de-identify a dataset using less than the available privacy budget, while also maintaining the dataset’s utility for analysis.  For example, the de-identified data, when put through the same analysis pipeline as the original dataset, should produce comparable results (e.g. similar coefficients in a linear regression model, or a classifier that produces similar predictions on sub-samples of the data).
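One way to read the regression example is sketched below in Python: fit the same model on the original data and on the released data, then compare coefficients. The data, the seed, and the use of the absolute coefficient gap as a utility measure are all assumptions made for illustration. The "released" data here is only a placeholder made by adding Gaussian noise; it is not a differentially private mechanism and is not taken from the challenge.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical "original" data: y is roughly 2x + 1.
xs = [rng.uniform(0, 10) for _ in range(1000)]
ys = [2.0 * x + 1.0 + rng.gauss(0, 1) for x in xs]

# Placeholder for a de-identified release. Adding Gaussian noise to y is
# NOT a differentially private mechanism; it only stands in for the output
# of whatever algorithm is being evaluated.
ys_released = [y + rng.gauss(0, 2) for y in ys]

slope_orig, intercept_orig = fit_line(xs, ys)
slope_rel, intercept_rel = fit_line(xs, ys_released)

# One possible utility measure: absolute gap between fitted coefficients.
slope_gap = abs(slope_orig - slope_rel)
intercept_gap = abs(intercept_orig - intercept_rel)
print(f"slope gap: {slope_gap:.3f}, intercept gap: {intercept_gap:.3f}")
```

The same pattern extends to the classifier example: train on each dataset, then compare predictions on held-out sub-samples.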

This stage of the Challenge seeks Conceptual Solutions that describe how to use and/or combine methods in differential privacy to mitigate privacy loss when publicly releasing datasets in a variety of industries such as public safety, law enforcement, healthcare/biomedical research, education, and finance.  We are limiting the scope to addressing research questions and methodologies that require regression, classification, and clustering analysis on datasets that contain numerical, geo-spatial, and categorical data.

To compete in this stage, we are asking that you propose a new algorithm utilizing existing or new randomized mechanisms with a justification of how this will optimize privacy and utility across different analysis types.  We are also asking you to propose a dataset that you believe would make a good use case for your proposed algorithm, and provide a means of comparing your algorithm and other algorithms.
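As a reference point for what an existing randomized mechanism looks like, here is a minimal Python sketch of the Laplace mechanism applied to a counting query, together with a simple tracker for the privacy budget under basic sequential composition (the epsilons of successive queries add up). The function names, the `PrivacyBudget` class, and the sample data are hypothetical. The sketch is not hardened: naive floating-point Laplace sampling is known to be vulnerable to attack, so it should not be used as-is on real data.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon, rng):
    """Laplace mechanism for a counting query.

    Adding or removing one record changes a count by at most 1, so the
    sensitivity is 1 and the noise scale is 1 / epsilon.
    """
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon, rng)


class PrivacyBudget:
    """Tracks epsilon spent under basic sequential composition."""

    def __init__(self, total_epsilon):
        self.total = total_epsilon
        self.spent = 0.0

    def spend(self, epsilon):
        # Small tolerance guards against floating-point round-off.
        if self.spent + epsilon > self.total + 1e-12:
            raise ValueError("privacy budget exceeded")
        self.spent += epsilon
        return epsilon


rng = random.Random(42)
ages = [23, 35, 41, 58, 62, 19, 77, 45]  # hypothetical records
budget = PrivacyBudget(total_epsilon=1.0)

# Two queries at epsilon = 0.4 each leave 0.2 of the budget unspent.
over_40 = private_count(ages, lambda a: a > 40, budget.spend(0.4), rng)
under_30 = private_count(ages, lambda a: a < 30, budget.spend(0.4), rng)
print(over_40, under_30, "epsilon spent:", budget.spent)
```

A smaller epsilon gives stronger privacy but noisier answers, which is the privacy-utility trade-off that proposals are asked to justify.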

All submissions must be made using the submission form provided on the HeroX website.  Submissions will be judged using the listed criteria and scoring scheme.  Challenge Sponsor has the right to make updates and/or make any changes at any time during the Challenge (see official rules).

Teams that participate in the HeroX challenge, as well as newly formed teams that did not participate, can proceed to a leaderboard-driven competition on Topcoder, the Algorithm Competition #1.  It is anticipated that Competition #1 will be followed by iterative improvements in the Algorithm Sprint, and finish with a final Challenge to further boost performance in the Algorithm Competition #2.  Where a competitor’s algorithm falls with respect to the utility-privacy frontier curve will determine who wins subsequent Topcoder Competitions.  The final review and decision of the judge will be announced in accordance with the rules on this site.

Schedule for Stage 1

Pre-registration began          February 1, 2018
Open to submissions             May 1, 2018 @ 12 noon ET
Submission deadline             August 2, 2018 @ 5 pm ET
People’s Choice Voting          August 14 – August 28, 2018
Winners Announced               September 12, 2018

Registration

To register for the challenge competition:

  • Go to the HeroX website.
  • Register with a username and password.
  • Official entries are accepted only through the HeroX platform on or before 5 PM ET August 2, 2018.
  • Registration for Stages 2 and 3 will take place in September and November 2018 through the TopCoder platform.  Announcements will be posted to Challenge.gov for the final two ‘algorithm’ stages.  Check back for details.

 

Judging Criteria

Analysis Class - Regression: Differential Privacy Capability and Utility - 15%

The balance of privacy and utility protected and the quality of evidence that privacy/utility will be protected at this level for each type of data (numerical, geo-spatial, and categorical data).

Analysis Class - Classification: Differential Privacy Capability and Utility - 15%

The balance of privacy and utility protected and the quality of evidence that privacy/utility will be protected at this level for each type of data (numerical, geo-spatial, and categorical data).

Analysis Class - Clustering: Differential Privacy Capability and Utility - 15%

The balance of privacy and utility protected and the quality of evidence that privacy/utility will be protected at this level for each type of data (numerical, geo-spatial, and categorical data).

Analysis Class - Unknown research question: Differential Privacy Capability and Utility - 15%

How does the solution handle a case where a dataset needs privacy protected, but the research questions are unknown?

Thoroughness in Self-Evaluation - 5%

The competitor answered the questions thoroughly, including the question about what use cases the Solution would not handle well.

Innovation - 20%

Subjective determination of uniqueness and of the likelihood of leading to greater future improvements than other Solutions.

Computing Requirements/ Feasibility - 5%

Feasibility of using this Solution for larger volume use cases

Robustness & Generalizability - 5%

The Solution handles the provided classes and types of data well and can handle other use case classes and types of data. This could also include the ability to vary the balance between privacy and utility.

Dataset suggestion - 5%

A dataset is proposed that contains numerical, geo-spatial, and categorical types of data, as well as existing exploratory data analysis such as regression, clustering, and classification analysis.
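The percentages above sum to 100. As a rough illustration of how such weights could combine into one score, the Python sketch below applies them to per-criterion ratings. The weights are the ones listed above; the 0-10 rating scale, the dictionary keys, and the example ratings are assumptions, since the actual scoring scale is not stated here.

```python
# Weights as listed in the Judging Criteria (percent of total score).
WEIGHTS = {
    "regression": 15,
    "classification": 15,
    "clustering": 15,
    "unknown_research_question": 15,
    "self_evaluation": 5,
    "innovation": 20,
    "feasibility": 5,
    "robustness": 5,
    "dataset_suggestion": 5,
}
assert sum(WEIGHTS.values()) == 100


def weighted_score(ratings, max_rating=10):
    """Combine per-criterion ratings (0..max_rating, an assumed scale) into a 0-100 score."""
    return sum(WEIGHTS[name] * ratings[name] / max_rating for name in WEIGHTS)


# Hypothetical ratings: 7 out of 10 everywhere, 9 out of 10 for innovation.
example = {name: 7 for name in WEIGHTS}
example["innovation"] = 9
total = weighted_score(example)
print(total)
```

Because Innovation carries the largest single weight, a two-point change there moves the total as much as a two-point change across four of the 5% criteria combined.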

How to Enter

How do I win?

NOTE: All submissions for this challenge must be made on the HeroX website.

To be eligible for an award, your proposal must, at minimum:

  • Meet the eligibility requirements stated below and in the Challenge Specific Agreement.
  • Satisfy the Judging Scorecard requirements
  • Thoughtfully address the Submission Form questions (submission link)
  • Be scored higher than your competitors!
Prizes
Stage 1 - Concept Paper, prizes totaling $50,000.00:
  • $15k Grand Prize
  • $10k Runner up
  • $5k Honorable Mention
  • $5k (up to 4) People's Choice
Stage 2 - Check back on challenge.gov in September 2018. Stage 2 will be an Algorithm Contest based on concepts from Stage 1, with prizes totaling $65k.
Stage 3 - Check back on challenge.gov in November 2018. Stage 3 is the final stage, an Algorithm Contest based on Stage 2 results with prizes totaling $75k.

3 Discussions for "The Unlinkable Data Challenge: Advancing Methods in Differential Privacy"

  • tmanley
    Hello! My sincere apologies for the delay. We like your way of thinking and your interest in the challenge. We are collecting all questions in one place so, would you mind asking your question again on the HeroX website? Here is the link https://www.herox.com/UnlinkableDataChallenge/update/2060 We will post answers to all questions by the end of next week, as we are formally collecting questions until June 15th. Feel free to ask questions anytime.

  • Seems to me that it ought to be in scope for the competition to submit a solution where no de-identification is done on the original (input) data, the data is kept secure before and during computation, and the results of the computation are then protected by differential privacy noise added after the computation is done. In particular, I'm thinking of a solution that uses secure computation to compute on the data while it remains encrypted, and then applies differential privacy before the result is revealed. Would that be in scope, sponsors of the challenge?

    • Robert Armstead
      I tend to agree with your views on this and am also interested in the response. Seems to me that the protection of the data initially would be of a higher priority. Also, adding noise would only be suitable for research where ranges are acceptable. For any type of medical research, the numbers and such would have to be spot-on accurate in order for the researchers to form conclusions. Another thing that I was thinking of was adding a person into the equation to ensure that data was properly de-identified (checks and balances) prior to release. Thoughts?

Solutions
No solutions have been posted for this challenge yet.
Rules

Challenge-Specific Agreement

PLEASE READ THIS CAREFULLY! You (“Innovator” or “Participant”) and NIST (“Challenge Sponsor”) are entering into this Challenge-Specific Agreement (“CSA”) for this particular incentive-based competition (“Challenge”) only. In order to participate in this Challenge, Innovator must accept these terms, and therefore should take the time to understand them. This CSA includes the NIST Official Rules.

1.   If Innovator clicks “Accept” and proceeds to register for this Challenge, this CSA will be a valid and binding agreement between Innovator and Challenge Sponsor, and is in addition to the existing HeroX Terms of Use for all purposes relating to this Challenge. Innovator should print and keep a copy of this CSA. No provisions that Innovator may have agreed to that are specific to any other individual challenge will apply.

In the event of any discrepancy or inconsistency between the terms and conditions of the official rules and disclosures or other statements contained in any Competition materials, including but not limited to the Competition submission form, Competition website and use terms, HeroX terms of participation, advertising (including but not limited to television, print, radio or online ads), the terms and conditions of the NIST Official Rules on Challenge.gov as specified within this Challenge Specific Agreement shall control.

2.    America COMPETES Reauthorization Act of 2010: All challenge and prize competitions shall be performed in accordance with the America COMPETES Reauthorization Act of 2010, Pub. Law 111-358, title I, § 105(a), Jan. 4, 2011, as amended, codified at 15 U.S.C. § 3719 (hereinafter “America COMPETES Act”).

 

3.    Eligibility: Each Competition Participant (individual, team, or legal entity) is required to register on the HeroX NIST Unlinkable Data Challenge:  Advancing Methods in Differential Privacy website. There shall be one Official Representative for each Competition Participant.  The Official Representative must provide a username (which may serve as a team or affiliation name), email address, and affirm that he/she has read and consents to be governed by the Competition Rules.  At NIST’s discretion, any violation of this rule will be grounds for disqualification from the Competition. Multiple individuals and/or legal entities may collaborate as a team to submit a single entry, in which case the designated Official Representative will be responsible for meeting all entry and evaluation requirements.  Participation is subject to all U.S. federal, state and local laws and regulations. Participants, including individuals and private entities, must not have been convicted of a felony criminal violation under any Federal law within the preceding 24 months and must not have any unpaid Federal tax liability that has been assessed, for which all judicial and administrative remedies have been exhausted or have lapsed, and that is not being paid in a timely manner pursuant to an agreement with the authority responsible for collecting the tax liability.  Participants must not be suspended, debarred, or otherwise excluded from doing business with the Federal Government.  Individuals entering on behalf of or representing a company, institution or other legal entity are responsible for confirming that their entry does not violate any policies of that company, institution or legal entity.  Any other individuals or legal entities involved with the design, production, execution, distribution or evaluation of the NIST Unlinkable Data Challenge: Advancing Methods in Differential Privacy website are not eligible to participate.

 

To be eligible for a cash prize:

  1. A Participant (whether an individual, team, or legal entity) must have registered to participate and complied with all of the requirements under section 3719 of title 15, United States Code as contained herein.
  2. At the time of Entry, the Official Representative (individual or team lead, in the case of a group project) must be age 18 or older and a U.S. citizen or permanent resident of the United States or its territories.
  3. In the case of a private entity, the business shall be incorporated in and maintain a primary place of business in the United States or its territories.
  4. Participants may not be a Federal entity or Federal employee acting within the scope of their employment. NIST employees are not eligible to participate. Non-NIST Federal employees acting in their personal capacities should consult with their respective agency ethics officials to determine whether their participation in this Competition is permissible.
  5. A Participant shall not be deemed ineligible because the Participant consulted with Federal employees or used Federal facilities in preparing its submission to the NIST Unlinkable Data Challenge: Advancing Methods in Differential Privacy Prize Competition if the Federal employees and facilities are made available to all Participants on an equitable basis.

4. Submissions: By participating in this Challenge, Innovator may submit to Challenge Sponsor submission materials (“Submission”), as outlined in these NIST Official Rules on Challenge.gov and the Challenge Guidelines specific to this Challenge on HeroX.com. By submitting a Submission, Innovator thereby agrees to provide reasonable assistance and additional information concerning the Submission to Challenge Sponsor, if requested.

 

5. Warranties: By submitting an Entry, the Participant represents and warrants that all information submitted is true and complete to the best of the Participant’s knowledge, that the Participant has the right and authority to submit the Entry on the Participant’s own behalf or on behalf of the persons and entities that the Participant specifies within the Entry, and that the Entry (both the information and materials submitted in the Entry and the underlying technology/method/idea/treatment protocol/solution described in the Entry):

  1. Is the Participant’s own original work, or is submitted by permission with full and proper credit given within the Entry;
  2. Does not contain trade secrets (the Participant’s or anyone else’s);
  3. Does not knowingly violate or infringe upon the patent rights, industrial design rights, copyrights, trademarks, rights of privacy, publicity or other intellectual property or other rights of any person or entity;
  4. Does not contain malicious code, such as viruses, malware, timebombs, cancelbots, worms, Trojan horses or other potentially harmful programs or other material or information;
  5. Does not and will not violate any applicable law, statute, ordinance, rule or regulation, including, without limitation, United States export laws and regulations, including but not limited to, the International Traffic in Arms Regulations and the Department of Commerce Export Regulations; and
  6. Does not trigger any reporting or royalty or other obligation to any third party.

 

6. Intellectual Property: Any applicable intellectual property rights to an Entry will remain with the Participant.   By participating in the prize challenge, the Participant is not granting any rights in any patents, pending patent applications, or copyrights related to the technology described in the Entry.  However, by submitting an Entry, the Participant is granting NIST, NASA, and any parties acting on their behalf certain limited rights as set forth herein.

  1. By submitting an Entry, the Participant grants to NIST, NASA, and any parties acting on their behalf the right to review the Entry, to describe the Entry in any materials created in connection with this competition, and to screen and evaluate the Entry. NIST and NASA, and any parties acting on their behalf will also have the right to publicize Participant’s name and, as applicable, the names of Participant’s team members and/or Organization which participated in submitting the Entry following the conclusion of the Competition.
  2. As part of its submission, the Participant must provide written consent granting NIST, NASA, and any parties acting on their behalf, a royalty-free, non-exclusive, irrevocable, worldwide license to display publicly and use for promotional purposes the Participant’s entry (“demonstration license”). This demonstration license includes posting or linking to the Participant’s entry on NIST and NASA’s websites, including the Competition Website, and partner websites, and inclusion of the Participant’s Entry in any other media, worldwide.

 

7. Trade Secret Information: By making a submission to this prize competition, the Participant agrees that no part of its submission includes any Trade Secret information, ideas or products. All submissions to this prize competition are deemed non-proprietary.  Since NIST does not wish to receive or hold any submitted materials “in confidence” it is agreed that, with respect to the Participant’s Entry, no confidential or fiduciary relationship or obligation of secrecy is established between NIST, NASA, or any parties acting on their behalf and the Participant, the Participant’s team, or the company or institution the Participant represents when submitting an Entry, or any other person or entity associated with any part of the Participant’s Entry.

 

8. Liability: Participants shall agree to assume any and all risks and waive claims against the Federal Government and its related entities, except in the case of willful misconduct, for any injury, death, damage, or loss of property, revenue, or profits, whether direct, indirect, or consequential, arising from participation in this prize competition, whether the injury, death, damage, or loss arises through negligence or otherwise.

 

9. Insurance: Participants are not required to obtain liability insurance for this Competition.

 

10. Indemnification: Participants shall agree to indemnify the Federal Government against third party claims for damages arising from or related to Challenge activities.

 

11. Changes and Cancellation: Challenge Sponsor has the right to make updates and/or make any changes at any time during the Challenge.  Innovators are responsible for regularly reviewing the official rules on Challenge.gov [https://www.challenge.gov/challenge/the-unlinkable-data-challenge-advancing-methods-in-differential-privacy] and updates on the HeroX site [https://www.herox.com/UnlinkableDataChallenge] to ensure they are meeting all rules and requirements of the Challenge.  Challenge Sponsor has the right to cancel the Challenge at any time, without warning or explanation, and to subsequently remove the Prize completely.

 

12. Payments: The prize competition winners will be paid prizes directly from NIST. Prior to payment, winners will be required to verify eligibility. The verification process with the agency includes providing the full legal name, tax identification number or social security number, routing number and banking account to which the prize money can be deposited directly.

 

13. Existing Laws: The Federal Government shall not, by virtue of conducting this prize competition, be responsible for compliance by Participants in the prize competition with Federal law, including licensing, export control, and nonproliferation laws, and related regulations.

 

Participation is subject to all U.S. federal, state and local laws and regulations. Participants are responsible for checking applicable laws and regulations in their jurisdiction(s) before participating in the prize competition to ensure that their participation is legal.   Individuals entering on behalf of or representing a company, institution or other legal entity are responsible for confirming that their entry does not violate any policies of that company, institution or legal entity.

 

14.  Registration and Submissions: Submissions must be made online (only), via upload to the HeroX website, on or before 5pm ET on August 2, 2018. All uploads must be in PDF format.  No late submissions will be accepted.

15.  Selection of Winners: Based on the winning criteria, prizes will be awarded per the Judging Criteria section in the Challenge Guidelines.  In the case of a tie, the winner(s) will be selected based on the highest votes from the judges.

16.  Judging: The final determination of the winners will be made at the sole discretion of NIST.  Scores and feedback from NIST will not be shared.

17. Popular Choice Awards Voting: The $5,000 popular choice awards will be awarded based on number of votes received during the voting period. A competitor is eligible to win both a judges’ award and the popular choice award.

  1. All votes are subject to review. Any competitor using unfair methods to solicit votes will be automatically disqualified from the challenge.
  2. Entries that are eligible for the Voting stage will become viewable to the public. Make sure that if your entry moves on to the Voting stage, that you’re OK with anyone seeing it!  Depending on the number of entries received, either all or a selected shortlist will move on to the Voting stage.

 

18. Privacy Advisory: The HeroX.com website is hosted by a private entity and is not a service of NIST.  The solicitation and collection of your personal or individually identifiable information is subject to the host’s privacy and security policies and will not be shared with NIST unless you win the Challenge. Challenge winners’ personally identifiable information must be made available to NIST in order to collect an award.

Submit Solution
Submissions for this competition are being accepted on a third-party site. Please visit the external site for instructions on submitting: https://www.herox.com/UnlinkableDataChallenge