metadata
base_model: Alibaba-NLP/gte-large-en-v1.5
library_name: sentence-transformers
metrics:
- cosine_accuracy@1
- cosine_accuracy@3
- cosine_accuracy@5
- cosine_accuracy@10
- cosine_precision@1
- cosine_precision@3
- cosine_precision@5
- cosine_precision@10
- cosine_recall@1
- cosine_recall@3
- cosine_recall@5
- cosine_recall@10
- cosine_ndcg@10
- cosine_mrr@10
- cosine_map@100
- dot_accuracy@1
- dot_accuracy@3
- dot_accuracy@5
- dot_accuracy@10
- dot_precision@1
- dot_precision@3
- dot_precision@5
- dot_precision@10
- dot_recall@1
- dot_recall@3
- dot_recall@5
- dot_recall@10
- dot_ndcg@10
- dot_mrr@10
- dot_map@100
pipeline_tag: sentence-similarity
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:586
- loss:MultipleNegativesRankingLoss
widget:
- source_sentence: >-
Explain the spectrum of openness in AI systems as described in the
document. How do open-source AI systems differ from fully closed AI
systems in terms of accessibility and innovation?
sentences:
- "targets of cyber attacks; or\n\_ \_ \_ \_ \_\_(iii)\_ permitting the evasion of human control or oversight through\nmeans of deception or obfuscation.\nModels meet this definition even if they are provided to end users with\ntechnical safeguards that attempt to prevent users from taking advantage of\nthe relevant unsafe capabilities.\_\n\_ \_ \_(l)\_ The term “Federal law enforcement agency” has the meaning set forth\nin section 21(a) of Executive Order 14074 of May 25, 2022 (Advancing\nEffective, Accountable Policing and Criminal Justice Practices To Enhance\nPublic Trust and Public Safety).\n\_ \_ \_(m)\_ The term “floating-point operation” means any mathematical\noperation or assignment involving floating-point numbers, which are a\nsubset of the real numbers typically represented on computers by an integer\nof fixed precision scaled by an integer exponent of a fixed base.\n\_ \_ \_(n)\_ The term “foreign person” has the meaning set forth in section 5(c) of\nExecutive Order 13984 of January 19, 2021 (Taking Additional Steps To\nAddress the National Emergency With Respect to Significant Malicious\nCyber-Enabled Activities).\n\_ \_ \_(o)\_ The terms “foreign reseller” and “foreign reseller of United States\nInfrastructure as a Service Products” mean a foreign person who has\nestablished an Infrastructure as a Service Account to provide Infrastructure\nas a Service Products subsequently, in whole or in part, to a third party.\n\_ \_ \_(p)\_ The term “generative AI” means the class of AI models that emulate\nthe structure and characteristics of input data in order to generate derived\nsynthetic content.\_ This can include images, videos, audio, text, and other\ndigital content.\n\_ \_ \_(q)\_ The terms “Infrastructure as a Service Product,” “United States\nInfrastructure as a Service Product,” “United States Infrastructure as a\nService Provider,” and “Infrastructure as a Service Account” each have the\nrespective meanings given to those terms in section 5 of Executive Order\n13984.\n\_ \_ \_(r)\_ The term “integer operation” means any mathematical operation or\nassignment involving only integers, or whole numbers expressed without a\ndecimal point.05/10/2024, 16:36 Executive Order on the Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence | The White House\nhttps://www.whitehouse.gov/briefing-room/presidential-actions/2023/10/30/executive-order-on-the-safe-secure-and-trustworthy-development-and-use-of-artific… 7/59"
- "AI safety, enable next-generation medical diagnoses and further other\ncritical AI priorities.\n\0\0 Released a for designing safe, secure, and trustworthy AI tools\nfor use in education. The Department of Education’s guide discusses\nhow developers of educational technologies can design AI that benefits\nstudents and teachers while advancing equity, civil rights, trust, and\ntransparency. This work builds on the Department’s 2023 \noutlining recommendations for the use of AI in teaching and learning.\n\0\0 Published guidance on evaluating the eligibility of patent claims\ninvolving inventions related to AI technology,\_as well as other\nemerging technologies. The guidance by the U.S. Patent and Trademark\nOffice will guide those inventing in the AI space to protect their AI\ninventions and assist patent examiners reviewing applications for\npatents on AI inventions.\n\0\0 Issued a on federal research and development (R&D) to\nadvance trustworthy AI over the past four years. The report by the\nNational Science and Technology Council examines an annual federal AI\nR&D budget of nearly $3 billion.\n\0\0 Launched a $23 million initiative to promote the use of privacy-\nenhancing technologies to solve real-world problems, including\nrelated to AI.\_Working with industry and agency partners, NSF will\ninvest through its new Privacy-preserving Data Sharing in Practice\nprogram in efforts to apply, mature, and scale privacy-enhancing\ntechnologies for specific use cases and establish testbeds to accelerate\ntheir adoption.\n\0\0 Announced millions of dollars in further investments to advance\nresponsible AI development and use throughout our society. These\ninclude $30 million invested through NSF’s Experiential Learning in\nEmerging and Novel Technologies program—which supports inclusive\nexperiential learning in fields like AI—and $10 million through NSF’s\nExpandAI program, which helps build capacity in AI research at\nminority-serving institutions while fostering the development of a\ndiverse, AI-ready workforce.\nAdvancing U.S. Leadership Abroad\nPresident Biden’s Executive Order emphasized that the United States lead\nglobal efforts to unlock AI’s potential and meet its challenges. To advance\nU.S. leadership on AI, agencies have:guide\nreport\nreport05/10/2024, 16:35 FACT SHEET: Biden-Harris Administration Announces New AI Actions and Receives Additional Major Voluntary Commitment on AI | The…\nhttps://www.whitehouse.gov/briefing-room/statements-releases/2024/07/26/fact-sheet-biden-harris-administration-announces-new-ai-actions-and-receives-addit… 4/10"
- >-
50 Governing AI for Humanity processes such as the recent scientific
report
on the risks of advanced AI commissioned by
the United Kingdom,25 and relevant regional
organizations.
e. A steering committee would develop a research
agenda ensuring the inclusivity of views and
incorporation of ethical considerations, oversee
the allocation of resources, foster collaboration
with a network of academic institutions and
other stakeholders, and review the panel’s
activities and deliverables.100 By drawing on the unique convening power
of the
United Nations and inclusive global reach across
stakeholder groups, an international scientific panel
can deliver trusted scientific collaboration processes
and outputs and correct information asymmetries
in ways that address the representation and
coordination gaps identified in paragraphs 66 and
73, thereby promoting equitable and effective
international AI governance.
Among the topics discussed in our consultations was the ongoing debate
over open versus closed AI systems.
AI systems that are open in varying degrees are often referred to as
“open-source AI”, but this is somewhat of a
misnomer when compared with open-source software (code). It is important
to recognize that openness in AI
systems is more of a spectrum than a single attribute.
One article explained that a “fully closed AI system is only accessible
to a particular group. It could be an AI
developer company or a specific group within it, mainly for internal
research and development purposes. On the
other hand, more open systems may allow public access or make available
certain parts, such as data, code, or
model characteristics, to facilitate external AI development.”a
Open-source AI systems in the generative AI field present both risks and
opportunities. Companies often cite “AI
safety” as a reason for not disclosing system specifications, reflecting
the ongoing tension between open and
closed approaches in the industry. Debates typically revolve around two
extremes: full openness, which entails
sharing all model components and data sets; and partial openness, which
involves disclosing only model weights.
Open-source AI systems encourage innovation and are often a requirement
for public funding. On the open
extreme of the spectrum, when the underlying code is made freely
available, developers around the world can
experiment, improve and create new applications. This fosters a
collaborative environment where ideas and
expertise are readily shared. Some industry leaders argue that this
openness is vital to innovation and economic
growth.
However, in most cases, open-source AI models are available as
application programming interfaces. In this case,
the original code is not shared, the original weights are never changed
and model updates become new models.
Additionally, open-source models tend to be smaller and more
transparent. This transparency can build trust,
allow for ethical considerations to be proactively addressed, and
support validation and replication because users
can examine the inner workings of the AI system, understand its
decision-making process and identify potential
biases.Box 9: Open versus closed AI systems
a Angela Luna, “The open or closed AI dilemma”, 2 May 2024. Available at
https://bipartisanpolicy.org/blog/the-open-or-closed-ai-dilemma .
25 International Scientific Report on the Safety of Advanced AI: Interim
Report. Available at
https://gov.uk/government/publications/international-scientific-report-
on-the-safety-of-advanced-ai .
- source_sentence: >-
What role does the report propose for the United Nations in establishing a
governance regime for AI, and how does it envision this regime
contributing to a new social contract that protects vulnerable
populations?
sentences:
- >-
HUMAN ALTERNATIVES,
CONSIDERATION, AND
FALLBACK
HOW THESE PRINCIPLES CAN MOVE INTO PRACTICE
Real-life examples of how these principles can become reality, through
laws, policies, and practical
technical and sociotechnical approaches to protecting rights,
opportunities, and access.
Healthcare “navigators” help people find their way through online signup
forms to choose
and obtain healthcare. A Navigator is “an individual or organization
that's trained and able to help
consumers, small businesses, and their employees as they look for health
coverage options through the
Marketplace (a government web site), including completing eligibility
and enrollment forms.”106 For
the 2022 plan year, the Biden-Harris Administration increased funding so
that grantee organizations could
“train and certify more than 1,500 Navigators to help uninsured
consumers find affordable and comprehensive
health coverage. ”107
The customer service industry has successfully integrated automated
services such as
chat-bots and AI-driven call response systems with escalation to a human
support team.
108 Many businesses now use partially automated customer service
platforms that help answer customer
questions and compile common problems for human agents to review. These
integrated human-AI
systems allow companies to provide faster customer care while
maintaining human agents to answer
calls or otherwise respond to complicated requests. Using both AI and
human agents is viewed as key to
successful customer service.109
Ballot curing laws in at least 24 states require a fallback system that
allows voters to
correct their ballot and have it counted in the case that a voter
signature matching algorithm incorrectly flags their ballot as invalid
or there is another issue with their ballot, and review by an election
official does not rectify the problem. Some federal courts have found
that such cure procedures are constitutionally required.
110 Ballot
curing processes vary among states, and include direct phone calls,
emails, or mail contact by election
officials.111 Voters are asked to provide alternative information or a
new signature to verify the validity of their
ballot.
52
- >-
SECTION TITLE
HUMAN ALTERNATIVES , C ONSIDERATION , AND FALLBACK
You should be able to opt out, where appropriate, and have access to a
person who can quickly
consider and remedy problems you encounter. You should be able to opt
out from automated systems in
favor of a human alternative, where appropriate. Appropriateness should
be determined based on reasonable expectations in a given context and
with a focus on ensuring broad accessibility and protecting the public
from especially harmful impacts. In some cases, a human or other
alternative may be required by law. You should have access to timely
human consideration and remedy by a fallback and escalation process if
an automated system fails, it produces an error, or you would like to
appeal or contest its impacts on you. Human consideration and fallback
should be accessible, equitable, effective, maintained, accompanied by
appropriate operator training, and should not impose an unreasonable
burden on the public. Automated systems with an intended use within
sensi
-
tive domains, including, but not limited to, criminal justice,
employment, education, and health, should additional -
ly be tailored to the purpose, provide meaningful access for oversight,
include training for any people interacting with the system, and
incorporate human consideration for adverse or high-risk decisions.
Reporting that includes a description of these human governance
processes and assessment of their timeliness, accessibility, outcomes,
and effectiveness should be made public whenever possible.
Definitions for key terms in The Blueprint for an AI Bill of Rights can
be found in Applying the Blueprint for an AI Bill of Rights.
Accompanying analysis and tools for actualizing each principle can be
found in the Technical Companion.
7
- |-
Final Report 21E. Reflections on institutional
models
lxiv Discussions about AI often resolve into extremes.
In our consultations around the world, we engaged
with those who see a future of boundless goods
provided by ever-cheaper, ever-more-helpful AI
systems. We also spoke with those wary of darker
futures, of division and unemployment, and even
extinction.8
lxv We do not know whether the utopian or dystopian
future is more likely. Equally, we are mindful that
the technology may go in a direction that does
away with this duality. This report focuses on
the near-term opportunities and risks, based on
science and grounded in fact.
lxvi The seven recommendations outlined above offer
our best hope for reaping the benefits of AI, while
minimizing and mitigating the risks, as AI continues
evolving. We are also mindful of the practical
challenges to international institution-building
on a larger scale. This is why we are proposing a
networked institutional approach, with light and
agile support. If or when risks become more acute
and the stakes for opportunities escalate, such
calculations may change.
lxvii The world wars led to the modern international
system; the development of ever-more-powerful
chemical, biological and nuclear weapons led
to regimes limiting their spread and promoting
peaceful uses of the underlying technologies.
Evolving understanding of our common humanity
led to the modern human rights system and our
ongoing commitment to the SDGs for all. Climate
change evolved from a niche concern to a global
challenge.lxviii AI may similarly rise to a level that requires more
resources and more authority than is proposed
in the above-mentioned recommendations,
into harder functions of norm elaboration,
implementation, monitoring, verification and
validation, enforcement, accountability, remedies
for harm and emergency responses. Reflecting on
such institutional models, therefore, is prudent. The
final section of this report seeks to contribute to
that effort.
4. A call to action
lxix We remain optimistic about the future with AI and
its positive potential. That optimism depends,
however, on realism about the risks and the
inadequacy of structures and incentives currently
in place. The technology is too important, and the
stakes are too high, to rely only on market forces
and a fragmented patchwork of national and
multilateral action.
lxx The United Nations can be the vehicle for a new
social contract for AI that ensures global buy-
in for a governance regime which protects and
empowers us all. Such a social contract will ensure
that opportunities are fairly distributed, and the
risks are not loaded on to the most vulnerable – or
passed on to future generations, as we have seen,
tragically, with climate change.
lxxi As a group and as individuals from across many
fields of expertise, organizations and parts of the
world, we look forward to continuing this crucial
conversation. Together with the many others we
have connected with on this journey, and the global
community they represent, we hope that this report
contributes to our combined efforts to govern AI
for humanity.
8 See https://safe.ai/work/statement-on-ai-risk .
- source_sentence: >-
What are the potential consequences of coordination gaps between various
AI governance initiatives, as highlighted in the context information?
sentences:
- |-
44 Governing AI for Humanity B. Coordination gaps
72 The ongoing emergence and evolution of AI
governance initiatives are not guaranteed to
work together effectively for humanity. Instead,
coordination gaps have appeared. Effective
handshaking between the selective plurilateral
initiatives (see fig. 8) and other regional initiatives is
not assured, risking incompatibility between regions.
73 Nor are there global mechanisms for all international
standards development organizations (see fig. 7),
international scientific research initiatives or AI
capacity-building initiatives to coordinate with each
other, undermining interoperability of approaches
and resulting in fragmentation. The resulting
coordination gaps between various sub-global
initiatives are in some cases best addressed at the
global level.
74 A separate set of coordination gaps arise within
the United Nations system, reflected in the array of
diverse United Nations documents and initiatives
in relation to AI. Figure 9 shows 27 United Nations-
related instruments in specific domains that may
apply to AI – 23 of them are binding and will require
interpretation as they pertain to AI. A further 29
domain-level documents from the United Nations
and related organizations focus specifically on AI,
none of which are binding.17 In some cases, these
can address AI risks and harness AI benefits in
specific domains.75 The level of activity shows the importance of AI
to United Nations programmes. As AI expands to
affect ever-wider aspects of society, there will be
growing calls for diverse parts of the United Nations
system to act, including through binding norms.
It also shows the ad hoc nature of the responses,
which have largely developed organically in specific
domains and without an overarching strategy. The
resulting coordination gaps invite overlaps and
hinder interoperability and impact.
76 The number and diversity of approaches are a sign
that the United Nations system is responding to
an emerging issue. With proper orchestration, and
in combination with processes taking a holistic
approach, these efforts can offer an efficient and
sustainable pathway to inclusive international AI
governance in specific domains. This could enable
meaningful, harmonized and coordinated impacts
on areas such as health, education, technical
standards and ethics, instead of merely contributing
to the proliferation of initiatives and institutions
in this growing field. International law, including
international human rights law, provides a shared
normative foundation for all AI-related efforts,
thereby facilitating coordination and coherence.
- "\0\0 Issued a comprehensive plan for U.S. engagement on global AI\nstandards.\_The plan, developed by the NIST, incorporates broad public\nand private-sector input, identifies objectives and priority areas for AI\nstandards work, and lays out actions for U.S. stakeholders including U.S.\nagencies. NIST and others agencies will report on priority actions in 180\ndays.\_\n\0\0 Developed for managing risks to human rights posed by AI.\nThe Department of State’s “Risk Management Profile for AI and Human\nRights”—developed in close coordination with NIST and the U.S. Agency\nfor International Development—recommends actions based on the NIST\nAI Risk Management Framework to governments, the private sector, and\ncivil society worldwide, to identify and manage risks to human rights\narising from the design, development, deployment, and use of AI.\_\n\0\0 Launched a global network of AI Safety Institutes and other\ngovernment-backed scientific offices to advance AI safety at a technical\nlevel.\_This network will accelerate critical information exchange and\ndrive toward common or compatible safety evaluations and policies.\n\0\0 Launched a landmark United Nations General Assembly resolution.\nThe unanimously adopted resolution, with more than 100 co-sponsors,\nlays out a common vision for countries around the world to promote the\nsafe and secure use of AI to address global challenges.\n\0\0 Expanded global support for the U.S.-led Political Declaration on the\nResponsible Military Use of Artificial Intelligence and\nAutonomy.\_\_Fifty-five nations now endorse the political declaration,\nwhich outlines a set of norms for the responsible development,\ndeployment, and use of military AI capabilities.\nThe Table below summarizes many of the activities that federal agencies\nhave completed in response to the Executive Order:guidance05/10/2024, 16:35 FACT SHEET: Biden-Harris Administration Announces New AI Actions and Receives Additional Major Voluntary Commitment on AI | The…\nhttps://www.whitehouse.gov/briefing-room/statements-releases/2024/07/26/fact-sheet-biden-harris-administration-announces-new-ai-actions-and-receives-addit… 5/10"
- >-
Final Report 55f. In addition, diverse stakeholders – in particular
technology companies and civil society
representatives – could be invited to engage
through existing institutions detailed below, as
well as policy workshops on particular aspects
of AI governance such as limits (if any) of open-
source approaches to the most advanced forms
of AI, thresholds for tracking and reporting of
AI incidents, application of human rights law to
novel use cases, or the use of competition law/
antitrust to address concentrations of power
among technology companies.30
g. The proposed AI office could also curate a
repository of AI governance examples, including
legislation, policies and institutions from
around the world for consideration of the policy
dialogue, working with existing efforts, such as
OECD.
109 Notwithstanding the two General Assembly
resolutions on AI in 2024, there is currently
no mandated institutionalized dialogue on
AI governance at the United Nations that
corresponds to the reliably inclusive vision of this
recommendation. Similar processes do exist at
the international level, but primarily in regional or
plurilateral constellations (para. 57), which are not
reliably inclusive and global.
110 Complementing a fluid process of plurilateral and
regional AI summits,31 the United Nations can
offer a stable home for dialogue on AI governance.
Inclusion by design – a crucial requirement for
playing a stabilizing role in geopolitically delicate
times – can also address representation and
coordination gaps identified in paragraphs 64 and
72, promoting more effective collective action on AI
governance in the common interest of all countries. AI standards
exchange
Recommendation 3: AI standards exchange
We recommend the creation of an AI standards
exchange, bringing together representatives from
national and international standard-development
organizations, technology companies, civil society
and representatives from the international scientific
panel. It would be tasked with:
a. Developing and maintaining a register of
definitions and applicable standards for
measuring and evaluating AI systems;
b. Debating and evaluating the standards and the
processes for creating them; and
c. Identifying gaps where new standards are
needed.
111 When AI systems were first explored, few standards
existed to help to navigate or measure this new
frontier. The Turing Test – of whether a machine can
exhibit behaviour equivalent to (or indistinguishable
from) a human being – captured the popular
imagination, but is of more cultural than scientific
significance. Indeed, it is telling that some of
the greatest computational advances have been
measured by their success in games, such as when
a computer could beat humans at chess, Go, poker
or Jeopardy. Such measures were easily understood
by non-specialists, but were neither rigorous nor
particularly scientific.
112 More recently, there has been a proliferation of
standards. Figure 13 illustrates the increasing
number of relevant standards adopted by ITU, the
International Organization for Standardization (ISO),
the International Electrotechnical Commission
(IEC) and the Institute of Electrical and Electronics
Engineers (IEEE).32
30 Such a gathering could also provide an opportunity for
multi-stakeholder debate of any hardening of the global governance of
AI. These might include, for
example, prohibitions on the development of uncontainable or
uncontrollable AI systems, or requirements that all AI systems be
sufficiently transparent so that
their consequences can be traced back to a legal actor that can assume
responsibility for them.
31 Although multiple AI summits have helped a subset of 20–30 countries
to align on AI safety issues, participation has been inconsistent:
Brazil, China and
Ireland endorsed the Bletchley Declaration in November 2023, but not the
Seoul Ministerial Statement six months later (see fig. 12). Conversely,
Mexico and
New Zealand endorsed the Seoul Ministerial Statement, but did not
endorse the Bletchley Declaration.
32 Many new standards are also emerging at the national and
multinational levels, such as the United States White House Voluntary AI
Commitments and the
European Union Codes of Practice for the AI Act.
- source_sentence: >-
Describe the minimum set of criteria that should be included in the
incident reporting process for GAI systems, according to the
organizational practices established for identifying incidents.
sentences:
- >-
APPENDIX
Summaries of Additional Engagements:
•OSTP created an email address ( [email protected] ) to solicit
comments from the public on the use of
artificial intelligence and other data-driven technologies in their
lives.
•OSTP issued a Request For Information (RFI) on the use and governance
of biometric technologies.113 The
purpose of this RFI was to understand the extent and variety of
biometric technologies in past, current, or
planned use; the domains in which these technologies are being used; the
entities making use of them; currentprinciples, practices, or policies
governing their use; and the stakeholders that are, or may be, impacted
by theiruse or regulation. The 130 responses to this RFI are available
in full online
114 and were submitted by the below
listed organizations and individuals:
Accenture
Access Now ACT | The App Association AHIP
AIethicist.org
Airlines for America Alliance for Automotive Innovation Amelia
Winger-Bearskin American Civil Liberties Union American Civil Liberties
Union of Massachusetts American Medical Association ARTICLE19 Attorneys
General of the District of Columbia, Illinois, Maryland, Michigan,
Minnesota, New York, North Carolina, Oregon, Vermont, and Washington
Avanade Aware Barbara Evans Better Identity Coalition Bipartisan Policy
Center Brandon L. Garrett and Cynthia Rudin Brian Krupp Brooklyn
Defender Services BSA | The Software Alliance Carnegie Mellon University
Center for Democracy & Technology Center for New Democratic Processes
Center for Research and Education on Accessible Technology and
Experiences at University of Washington, Devva Kasnitz, L Jean Camp,
Jonathan Lazar, Harry Hochheiser Center on Privacy & Technology at
Georgetown Law Cisco Systems City of Portland Smart City PDX Program
CLEAR Clearview AI Cognoa Color of Change Common Sense Media Computing
Community Consortium at Computing Research Association Connected Health
Initiative Consumer Technology Association Courtney Radsch Coworker
Cyber Farm Labs Data & Society Research Institute Data for Black Lives
Data to Actionable Knowledge Lab at Harvard University Deloitte Dev
Technology Group Digital Therapeutics Alliance Digital Welfare State &
Human Rights Project and Center for Human Rights and Global Justice at
New York University School of Law, and Temple University Institute for
Law, Innovation & Technology Dignari Douglas Goddard Edgar Dworsky
Electronic Frontier Foundation Electronic Privacy Information Center,
Center for Digital Democracy, and Consumer Federation of America FaceTec
Fight for the Future Ganesh Mani Georgia Tech Research Institute Google
Health Information Technology Research and Development Interagency
Working Group HireVue HR Policy Association ID.me Identity and Data
Sciences Laboratory at Science Applications International Corporation
Information Technology and Innovation Foundation Information Technology
Industry Council Innocence Project Institute for Human-Centered
Artificial Intelligence at Stanford University Integrated Justice
Information Systems Institute International Association of Chiefs of
Police International Biometrics + Identity Association International
Business Machines Corporation International Committee of the Red Cross
Inventionphysics iProov Jacob Boudreau Jennifer K. Wagner, Dan Berger,
Margaret Hu, and Sara Katsanis Jonathan Barry-Blocker Joseph Turow Joy
Buolamwini Joy Mack Karen Bureau Lamont Gholston Lawyers’ Committee for
Civil Rights Under Law
60
- >-
19 GV-4.1-003 Establish policies, procedures, and processes for
oversight functions (e.g., senior
leadership, legal, compliance, including internal evaluation ) across
the GAI
lifecycle, from problem formulation and supply chains to system
decommission. Value Chain and Component
Integration
AI Actor Tasks: AI Deployment, AI Design, AI Development, Operation and
Monitoring
GOVERN 4.2: Organizational teams document the risks and potential
impacts of the AI technology they design, develop, deploy,
evaluate, and use, and they communicate about the impacts more
broadly.
Action ID Suggested Action GAI Risks
GV-4.2-001 Establish terms of use and terms of service for GAI systems
. Intellectual Property ; Dangerous ,
Violent, or Hateful Content ;
Obscene, Degrading, and/or
Abusive Content
GV-4.2-002 Include relevant AI Actors in the GAI system risk
identification process. Human -AI Configuration
GV-4.2-0 03 Verify that downstream GAI system impacts (such as the use
of third -party
plugins) are included in the impact documentation process. Value Chain
and Component
Integration
AI Actor Tasks: AI Deployment, AI Design, AI Development, Operation and
Monitoring
GOVERN 4.3: Organizational practices are in place to enable AI testing,
identification of incidents, and information sharing.
Action ID Suggested Action GAI Risks
GV4.3-- 001 Establish policies for measuring the effectiveness of
employed content
provenance methodologies (e.g., cryptography, watermarking,
steganography, etc.) Information Integrity
GV-4.3-002 Establish o rganizational practices to identify the minimum
set of criteria
necessary for GAI system incident reporting such as: System ID (auto
-generated
most likely), Title, Reporter, System/Source, Data Reported, Date of
Incident, Description, Impact(s), Stakeholder(s) Impacted. Information
Security
- >-
72 Governing AI for Humanity Box 15: Possible functions and
first-year deliverables of the AI office
The AI office should have a light structure and aim to be agile, trusted
and networked. Where necessary, it should
operate in a “hub and spoke” manner to connect to other parts of the
United Nations system and beyond.
Outreach could include serving as a key node in a so-called soft
coordination architecture between Member
States, plurilateral networks, civil society organizations, academia and
technology companies in a regime complex
that weaves together to solve problems collaboratively through
networking, and as a safe, trusted place to
convene on relevant topics. Ambitiously, it could become the glue that
helps to hold such other evolving networks
together.
Supporting the various initiatives proposed in this report includes the
important function of ensuring inclusiveness
at speed in delivering outputs such as scientific reports, governance
dialogue and identifying appropriate follow-
up entities.
Common understanding :
• Facilitate recruitment of and support the international scientific
panel.
Common ground :
• Service policy dialogues with multi-stakeholder inputs in support of
interoperability and policy learning.
An initial priority topic is the articulation of risk thresholds and
safety frameworks across jurisdictions
• Support ITU, ISO/IEC and IEEE on setting up the AI standards exchange.
Common benefits :
• Support the AI capacity development network with an initial focus on
building public interest AI capacity
among public officials and social entrepreneurs. Define the initial
network vision, outcomes, go vernance
structure, partnerships and operational mechanisms.
• Define the vision, outcomes, governance structure and operational
mechanisms for the global fund for AI,
and seek feedback from Member States, industry and civil society
stakeholders on the proposal, with a
view to funding initial projects within six months of establishment.
• Prepare and publish an annual list of prioritized investment areas to
guide both the global fund for AI and
investments outside that structure.
Coherent effort :
• Establish lightweight mechanisms that support Member States and other
relevant organizations to be
more connected, coordinated and effective in pursuing their global AI
governance efforts.
• Prepare initial frameworks to guide and monitor the AI office’s work,
including a global governance risk
taxonomy, a global AI policy landscape review and a global stakeholder
map.
• Develop and implement quarterly reporting and periodic in-person
presentations to Member States on
the AI office’s progress against its workplan and establish feedback
channels to support adjustments as
needed.
• Establish a steering committee jointly led by the AI office, ITU, UNC
TAD, UNESCO and other relevant
United Nations entities and organizations to accelerate the work of the
United Nations in service of the
functions above, and review progress of the accelerated efforts every
three months.
• Promote joint learning and development opportunities for Member State
representatives to support them
to carry out their responsibilities for global AI governance, in
cooperation with relevant United Nations
entities and organizations such as the United Nations Institute for
Training and Research and the United
Nations University.
- source_sentence: >-
What are some of the legal frameworks mentioned in the context that aim to
protect personal information, and how do they relate to data privacy
concerns?
sentences:
- >-
NOTICE &
EXPLANATION
WHAT SHOULD BE EXPECTED OF AUTOMATED SYSTEMS
The expectations for automated systems are meant to serve as a blueprint
for the development of additional
technical standards and practices that are tailored for particular
sectors and contexts.
Tailored to the level of risk. An assessment should be done to determine
the level of risk of the auto -
mated system. In settings where the consequences are high as determined
by a risk assessment, or extensive
oversight is expected (e.g., in criminal justice or some public sector
settings), explanatory mechanisms should be built into the system design
so that the system’s full behavior can be explained in advance (i.e.,
only fully transparent models should be used), rather than as an
after-the-decision interpretation. In other settings, the extent of
explanation provided should be tailored to the risk level.
Valid. The explanation provided by a system should accurately reflect
the factors and the influences that led
to a particular decision, and should be meaningful for the particular
customization based on purpose, target, and level of risk. While
approximation and simplification may be necessary for the system to
succeed based on the explanatory purpose and target of the explanation,
or to account for the risk of fraud or other concerns related to
revealing decision-making information, such simplifications should be
done in a scientifically supportable way. Where appropriate based on the
explanatory system, error ranges for the explanation should be
calculated and included in the explanation, with the choice of
presentation of such information balanced with usability and overall
interface complexity concerns.
Demonstrate protections for notice and explanation
Reporting. Summary reporting should document the determinations made
based on the above consider -
ations, including: the responsible entities for accountability purposes;
the goal and use cases for the system, identified users, and impacted
populations; the assessment of notice clarity and timeliness; the
assessment of the explanation's validity and accessibility; the
assessment of the level of risk; and the account and assessment of how
explanations are tailored, including to the purpose, the recipient of
the explanation, and the level of risk. Individualized profile
information should be made readily available to the greatest extent
possible that includes explanations for any system impacts or
inferences. Reporting should be provided in a clear plain language and
machine-readable manner.
44
- >-
25 MP-2.3-002 Review and document accuracy, representativeness,
relevance, suitability of data
used at different stages of AI life cycle. Harmful Bias and
Homogenization ;
Intellectual Property
MP-2.3-003 Deploy and document fact -checking techniques to verify the
accuracy and
veracity of information generated by GAI systems, especially when the
information comes from multiple (or unknown) sources. Information
Integrity
MP-2.3-004 Develop and implement testing techniques to identify GAI
produced content (e.g., synthetic media) that might be indistinguishable
from human -generated content. Information Integrity
MP-2.3-005 Implement plans for GAI systems to undergo regular
adversarial testing to identify
vulnerabilities and potential manipulation or misuse. Information
Security
AI Actor Tasks: AI Development, Domain Experts, TEVV
MAP 3.4: Processes for operator and practitioner proficiency with AI
system performance and trustworthiness – and relevant
technical standards and certifications – are defined, assessed, and
documented.
Action ID Suggested Action GAI Risks
MP-3.4-001 Evaluate whether GAI operators and end -users can accurately
understand
content lineage and origin. Human -AI Configuration ;
Information Integrity
MP-3.4-002 Adapt existing training programs to include modules on
digital content
transparency. Information Integrity
MP-3.4-003 Develop certification programs that test proficiency in
managing GAI risks and
interpreting content provenance, relevant to specific industry and
context. Information Integrity
MP-3.4-004 Delineate human proficiency tests from tests of GAI
capabilities. Human -AI Configuration
MP-3.4-005 Implement systems to continually monitor and track the
outcomes of human- GAI
configurations for future refinement and improvements . Human -AI
Configuration ;
Information Integrity
MP-3.4-006 Involve the end -users, practitioners, and operators in GAI
system in prototyping
and testing activities. Make sure these tests cover various scenarios ,
such as crisis
situations or ethically sensitive contexts. Human -AI Configuration ;
Information Integrity ; Harmful Bias
and Homogenization ; Dangerous ,
Violent, or Hateful Content
AI Actor Tasks: AI Design, AI Development, Domain Experts, End -Users,
Human Factors, Operation and Monitoring
- >-
65. See, e.g., Scott Ikeda. Major Data Broker Exposes 235 Million Social
Media Profiles in Data Lead: Info
Appears to Have Been Scraped Without Permission. CPO Magazine. Aug. 28,
2020. https://
www.cpomagazine.com/cyber-security/major-data-broker-exposes-235-million-social-media-profiles-
in-data-leak/; Lily Hay Newman. 1.2 Billion Records Found Exposed Online
in a Single Server . WIRED,
Nov. 22, 2019.
https://www.wired.com/story/billion-records-exposed-online/
66.Lola Fadulu. Facial Recognition Technology in Public Housing Prompts
Backlash . New York Times.
Sept. 24, 2019.
https://www.nytimes.com/2019/09/24/us/politics/facial-recognition-technology-housing.html
67. Jo Constantz. ‘They Were Spying On Us’: Amazon, Walmart, Use
Surveillance Technology to Bust
Unions. Newsweek. Dec. 13, 2021.
https://www.newsweek.com/they-were-spying-us-amazon-walmart-use-surveillance-technology-bust-
unions-1658603
68. See, e.g., enforcement actions by the FTC against the photo storage
app Everalbaum
(https://www.ftc.gov/legal-library/browse/cases-proceedings/192-3172-everalbum-inc-matter),
and
against Weight Watchers and their subsidiary
Kurbo(https://www.ftc.gov/legal-library/browse/cases-proceedings/1923228-weight-watchersww)
69. See, e.g., HIPAA, Pub. L 104-191 (1996); Fair Debt Collection
Practices Act (FDCPA), Pub. L. 95-109
(1977); Family Educational Rights and Privacy Act (FERPA) (20 U.S.C. §
1232g), Children's Online
Privacy Protection Act of 1998, 15 U.S.C. 6501–6505, and Confidential
Information Protection andStatistical Efficiency Act (CIPSEA) (116 Stat.
2899)
70. Marshall Allen. You Snooze, You Lose: Insurers Make The Old Adage
Literally True . ProPublica. Nov.
21, 2018.
https://www.propublica.org/article/you-snooze-you-lose-insurers-make-the-old-adage-literally-true
71.Charles Duhigg. How Companies Learn Your Secrets. The New York Times.
Feb. 16, 2012.
https://www.nytimes.com/2012/02/19/magazine/shopping-habits.html72. Jack
Gillum and Jeff Kao. Aggression Detectors: The Unproven, Invasive
Surveillance Technology
Schools are Using to Monitor Students. ProPublica. Jun. 25, 2019.
https://features.propublica.org/aggression-detector/the-unproven-invasive-surveillance-technology-
schools-are-using-to-monitor-students/
73.Drew Harwell. Cheating-detection companies made millions during the
pandemic. Now students are
fighting back. Washington Post. Nov. 12, 2020.
https://www.washingtonpost.com/technology/2020/11/12/test-monitoring-student-revolt/
74. See, e.g., Heather Morrison. Virtual Testing Puts Disabled Students
at a Disadvantage. Government
Technology. May 24, 2022.
https://www.govtech.com/education/k-12/virtual-testing-puts-disabled-students-at-a-disadvantage;
Lydia X. Z. Brown, Ridhi Shetty, Matt Scherer, and Andrew Crawford.
Ableism And Disability
Discrimination In New Surveillance Technologies: How new surveillance
technologies in education,
policing, health care, and the workplace disproportionately harm
disabled people . Center for Democracy
and Technology Report. May 24,
2022.https://cdt.org/insights/ableism-and-disability-discrimination-in-new-surveillance-technologies-how-new-surveillance-technologies-in-education-policing-health-care-and-the-workplace-disproportionately-harm-disabled-people/
69
model-index:
- name: SentenceTransformer based on Alibaba-NLP/gte-large-en-v1.5
results:
- task:
type: information-retrieval
name: Information Retrieval
dataset:
name: Unknown
type: unknown
metrics:
- type: cosine_accuracy@1
value: 0.71875
name: Cosine Accuracy@1
- type: cosine_accuracy@3
value: 0.921875
name: Cosine Accuracy@3
- type: cosine_accuracy@5
value: 0.96875
name: Cosine Accuracy@5
- type: cosine_accuracy@10
value: 1
name: Cosine Accuracy@10
- type: cosine_precision@1
value: 0.71875
name: Cosine Precision@1
- type: cosine_precision@3
value: 0.30729166666666663
name: Cosine Precision@3
- type: cosine_precision@5
value: 0.19374999999999998
name: Cosine Precision@5
- type: cosine_precision@10
value: 0.09999999999999999
name: Cosine Precision@10
- type: cosine_recall@1
value: 0.71875
name: Cosine Recall@1
- type: cosine_recall@3
value: 0.921875
name: Cosine Recall@3
- type: cosine_recall@5
value: 0.96875
name: Cosine Recall@5
- type: cosine_recall@10
value: 1
name: Cosine Recall@10
- type: cosine_ndcg@10
value: 0.8727659974381962
name: Cosine Ndcg@10
- type: cosine_mrr@10
value: 0.8304687500000002
name: Cosine Mrr@10
- type: cosine_map@100
value: 0.8304687500000001
name: Cosine Map@100
- type: dot_accuracy@1
value: 0.734375
name: Dot Accuracy@1
- type: dot_accuracy@3
value: 0.921875
name: Dot Accuracy@3
- type: dot_accuracy@5
value: 0.96875
name: Dot Accuracy@5
- type: dot_accuracy@10
value: 1
name: Dot Accuracy@10
- type: dot_precision@1
value: 0.734375
name: Dot Precision@1
- type: dot_precision@3
value: 0.30729166666666663
name: Dot Precision@3
- type: dot_precision@5
value: 0.19374999999999998
name: Dot Precision@5
- type: dot_precision@10
value: 0.09999999999999999
name: Dot Precision@10
- type: dot_recall@1
value: 0.734375
name: Dot Recall@1
- type: dot_recall@3
value: 0.921875
name: Dot Recall@3
- type: dot_recall@5
value: 0.96875
name: Dot Recall@5
- type: dot_recall@10
value: 1
name: Dot Recall@10
- type: dot_ndcg@10
value: 0.8785327200386421
name: Dot Ndcg@10
- type: dot_mrr@10
value: 0.8382812500000002
name: Dot Mrr@10
- type: dot_map@100
value: 0.8382812500000001
name: Dot Map@100
SentenceTransformer based on Alibaba-NLP/gte-large-en-v1.5
This is a sentence-transformers model finetuned from Alibaba-NLP/gte-large-en-v1.5. It maps sentences & paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: Alibaba-NLP/gte-large-en-v1.5
- Maximum Sequence Length: 8192 tokens
- Output Dimensionality: 1024 dimensions
- Similarity Function: Cosine Similarity
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
Full Model Architecture
SentenceTransformer(
(0): Transformer({'max_seq_length': 8192, 'do_lower_case': False}) with Transformer model: NewModel
(1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("sentence_transformers_model_id")  # placeholder; replace with this model's Hub repo ID
# Run inference
sentences = [
'What are some of the legal frameworks mentioned in the context that aim to protect personal information, and how do they relate to data privacy concerns?',
"65. See, e.g., Scott Ikeda. Major Data Broker Exposes 235 Million Social Media Profiles in Data Lead: Info\nAppears to Have Been Scraped Without Permission. CPO Magazine. Aug. 28, 2020. https://\nwww.cpomagazine.com/cyber-security/major-data-broker-exposes-235-million-social-media-profiles-\nin-data-leak/; Lily Hay Newman. 1.2 Billion Records Found Exposed Online in a Single Server . WIRED,\nNov. 22, 2019. https://www.wired.com/story/billion-records-exposed-online/\n66.Lola Fadulu. Facial Recognition Technology in Public Housing Prompts Backlash . New York Times.\nSept. 24, 2019.\nhttps://www.nytimes.com/2019/09/24/us/politics/facial-recognition-technology-housing.html\n67. Jo Constantz. ‘They Were Spying On Us’: Amazon, Walmart, Use Surveillance Technology to Bust\nUnions. Newsweek. Dec. 13, 2021.\nhttps://www.newsweek.com/they-were-spying-us-amazon-walmart-use-surveillance-technology-bust-\nunions-1658603\n68. See, e.g., enforcement actions by the FTC against the photo storage app Everalbaum\n(https://www.ftc.gov/legal-library/browse/cases-proceedings/192-3172-everalbum-inc-matter), and\nagainst Weight Watchers and their subsidiary Kurbo(https://www.ftc.gov/legal-library/browse/cases-proceedings/1923228-weight-watchersww)\n69. See, e.g., HIPAA, Pub. L 104-191 (1996); Fair Debt Collection Practices Act (FDCPA), Pub. L. 95-109\n(1977); Family Educational Rights and Privacy Act (FERPA) (20 U.S.C. § 1232g), Children's Online\nPrivacy Protection Act of 1998, 15 U.S.C. 6501–6505, and Confidential Information Protection andStatistical Efficiency Act (CIPSEA) (116 Stat. 2899)\n70. Marshall Allen. You Snooze, You Lose: Insurers Make The Old Adage Literally True . ProPublica. Nov.\n21, 2018.\nhttps://www.propublica.org/article/you-snooze-you-lose-insurers-make-the-old-adage-literally-true\n71.Charles Duhigg. How Companies Learn Your Secrets. The New York Times. Feb. 16, 2012.\nhttps://www.nytimes.com/2012/02/19/magazine/shopping-habits.html72. Jack Gillum and Jeff Kao. Aggression Detectors: The Unproven, Invasive Surveillance Technology\nSchools are Using to Monitor Students. ProPublica. Jun. 25, 2019.\nhttps://features.propublica.org/aggression-detector/the-unproven-invasive-surveillance-technology-\nschools-are-using-to-monitor-students/\n73.Drew Harwell. Cheating-detection companies made millions during the pandemic. Now students are\nfighting back. Washington Post. Nov. 12, 2020.\nhttps://www.washingtonpost.com/technology/2020/11/12/test-monitoring-student-revolt/\n74. See, e.g., Heather Morrison. Virtual Testing Puts Disabled Students at a Disadvantage. Government\nTechnology. May 24, 2022.\nhttps://www.govtech.com/education/k-12/virtual-testing-puts-disabled-students-at-a-disadvantage;\nLydia X. Z. Brown, Ridhi Shetty, Matt Scherer, and Andrew Crawford. Ableism And Disability\nDiscrimination In New Surveillance Technologies: How new surveillance technologies in education,\npolicing, health care, and the workplace disproportionately harm disabled people . Center for Democracy\nand Technology Report. May 24, 2022.https://cdt.org/insights/ableism-and-disability-discrimination-in-new-surveillance-technologies-how-new-surveillance-technologies-in-education-policing-health-care-and-the-workplace-disproportionately-harm-disabled-people/\n69",
'25 MP-2.3-002 Review and document accuracy, representativeness, relevance, suitability of data \nused at different stages of AI life cycle. Harmful Bias and Homogenization ; \nIntellectual Property \nMP-2.3-003 Deploy and document fact -checking techniques to verify the accuracy and \nveracity of information generated by GAI systems, especially when the \ninformation comes from multiple (or unknown) sources. Information Integrity \nMP-2.3-004 Develop and implement testing techniques to identify GAI produced content (e.g., synthetic media) that might be indistinguishable from human -generated content. Information Integrity \nMP-2.3-005 Implement plans for GAI systems to undergo regular adversarial testing to identify \nvulnerabilities and potential manipulation or misuse. Information Security \nAI Actor Tasks: AI Development, Domain Experts, TEVV \n \nMAP 3.4: Processes for operator and practitioner proficiency with AI system performance and trustworthiness – and relevant \ntechnical standards and certifications – are defined, assessed, and documented. \nAction ID Suggested Action GAI Risks \nMP-3.4-001 Evaluate whether GAI operators and end -users can accurately understand \ncontent lineage and origin. Human -AI Configuration ; \nInformation Integrity \nMP-3.4-002 Adapt existing training programs to include modules on digital content \ntransparency. Information Integrity \nMP-3.4-003 Develop certification programs that test proficiency in managing GAI risks and \ninterpreting content provenance, relevant to specific industry and context. Information Integrity \nMP-3.4-004 Delineate human proficiency tests from tests of GAI capabilities. Human -AI Configuration \nMP-3.4-005 Implement systems to continually monitor and track the outcomes of human- GAI \nconfigurations for future refinement and improvements . Human -AI Configuration ; \nInformation Integrity \nMP-3.4-006 Involve the end -users, practitioners, and operators in GAI system in prototyping \nand testing activities. Make sure these tests cover various scenarios , such as crisis \nsituations or ethically sensitive contexts. Human -AI Configuration ; \nInformation Integrity ; Harmful Bias \nand Homogenization ; Dangerous , \nViolent, or Hateful Content \nAI Actor Tasks: AI Design, AI Development, Domain Experts, End -Users, Human Factors, Operation and Monitoring',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 1024]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
Evaluation
Metrics
Information Retrieval
- Evaluated with InformationRetrievalEvaluator
Metric | Value |
---|---|
cosine_accuracy@1 | 0.7188 |
cosine_accuracy@3 | 0.9219 |
cosine_accuracy@5 | 0.9688 |
cosine_accuracy@10 | 1.0 |
cosine_precision@1 | 0.7188 |
cosine_precision@3 | 0.3073 |
cosine_precision@5 | 0.1937 |
cosine_precision@10 | 0.1 |
cosine_recall@1 | 0.7188 |
cosine_recall@3 | 0.9219 |
cosine_recall@5 | 0.9688 |
cosine_recall@10 | 1.0 |
cosine_ndcg@10 | 0.8728 |
cosine_mrr@10 | 0.8305 |
cosine_map@100 | 0.8305 |
dot_accuracy@1 | 0.7344 |
dot_accuracy@3 | 0.9219 |
dot_accuracy@5 | 0.9688 |
dot_accuracy@10 | 1.0 |
dot_precision@1 | 0.7344 |
dot_precision@3 | 0.3073 |
dot_precision@5 | 0.1937 |
dot_precision@10 | 0.1 |
dot_recall@1 | 0.7344 |
dot_recall@3 | 0.9219 |
dot_recall@5 | 0.9688 |
dot_recall@10 | 1.0 |
dot_ndcg@10 | 0.8785 |
dot_mrr@10 | 0.8383 |
dot_map@100 | 0.8383 |
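The table above was produced with the sentence-transformers InformationRetrievalEvaluator. Below is a minimal sketch of how such an evaluation could be reproduced; the queries, corpus, and relevance mapping shown are hypothetical placeholders rather than the actual evaluation split, and the model ID is the same placeholder used in the usage example above.
from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import InformationRetrievalEvaluator

model = SentenceTransformer("sentence_transformers_model_id")  # placeholder repo ID

# Hypothetical evaluation data: query ID -> query text, passage ID -> passage text,
# and query ID -> set of IDs of the relevant passages for that query.
queries = {"q1": "What are the potential consequences of coordination gaps between AI governance initiatives?"}
corpus = {
    "d1": "44 Governing AI for Humanity B. Coordination gaps ...",
    "d2": "Final Report 21E. Reflections on institutional models ...",
}
relevant_docs = {"q1": {"d1"}}

evaluator = InformationRetrievalEvaluator(queries, corpus, relevant_docs, name="eval")
results = evaluator(model)  # recent library versions return a dict of accuracy@k, precision@k, recall@k, NDCG@10, MRR@10 and MAP@100
print(results)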
Training Details
Training Dataset
Unnamed Dataset
- Size: 586 training samples
- Columns: sentence_0 and sentence_1
- Approximate statistics based on the first 586 samples:
 | sentence_0 | sentence_1 |
---|---|---|
type | string | string |
details | min: 20 tokens, mean: 35.95 tokens, max: 60 tokens | min: 8 tokens, mean: 545.8 tokens, max: 1018 tokens |
- Samples (each pairs a generated question, sentence_0, with its source passage, sentence_1):
  - sentence_0: What are the primary objectives outlined in the "Blueprint for an AI Bill of Rights" as it pertains to the American people?
    sentence_1: BLUEPRINT FOR AN AI BILL OF RIGHTS / MAKING AUTOMATED SYSTEMS WORK FOR THE AMERICAN PEOPLE / OCTOBER 2022
  - sentence_0: In what ways does the document propose to ensure that automated systems are designed and implemented to benefit society?
    sentence_1: BLUEPRINT FOR AN AI BILL OF RIGHTS / MAKING AUTOMATED SYSTEMS WORK FOR THE AMERICAN PEOPLE / OCTOBER 2022
  - sentence_0: What is the primary purpose of the Blueprint for an AI Bill of Rights as published by the White House Office of Science and Technology Policy in October 2022?
    sentence_1:
About this Document
The Blueprint for an AI Bill of Rights: Making Automated Systems Work for the American People was
published by the White House Office of Science and Technology Policy in October 2022. This framework was
released one year after OSTP announced the launch of a process to develop “a bill of rights for an AI-powered
world.” Its release follows a year of public engagement to inform this initiative. The framework is available
online at: https://www.whitehouse.gov/ostp/ai-bill-of-rights
About the Office of Science and Technology Policy
The Office of Science and Technology Policy (OSTP) was established by the National Science and Technology
Policy, Organization, and Priorities Act of 1976 to provide the President and others within the Executive Office
of the President with advice on the scientific, engineering, and technological aspects of the economy, national
security, health, foreign relations, the environment, and the technological recovery and use of resources, among
other topics. OSTP leads interagency science and technology policy coordination efforts, assists the Office of
Management and Budget (OMB) with an annual review and analysis of Federal research and development in
budgets, and serves as a source of scientific and technological analysis and judgment for the President with
respect to major policies, plans, and programs of the Federal Government.
Legal Disclaimer
The Blueprint for an AI Bill of Rights: Making Automated Systems Work for the American People is a white paper
published by the White House Office of Science and Technology Policy. It is intended to support the
development of policies and practices that protect civil rights and promote democratic values in the building,
deployment, and governance of automated systems.
The Blueprint for an AI Bill of Rights is non-binding and does not constitute U.S. government policy. It
does not supersede, modify, or direct an interpretation of any existing statute, regulation, policy, or
international instrument. It does not constitute binding guidance for the public or Federal agencies and
therefore does not require compliance with the principles described herein. It also is not determinative of what
the U.S. government’s position will be in any international negotiation. Adoption of these principles may not
meet the requirements of existing statutes, regulations, policies, or international instruments, or the
requirements of the Federal agencies that enforce them. These principles are not intended to, and do not,
prohibit or limit any lawful activity of a government agency, including law enforcement, national security, or
intelligence activities.
The appropriate application of the principles set forth in this white paper depends significantly on the
context in which automated systems are being utilized. In some circumstances, application of these principles
in whole or in part may not be appropriate given the intended use of automated systems to achieve government
agency missions. Future sector-specific guidance will likely be necessary and important for guiding the use of
automated systems in certain settings such as AI systems used as part of school building security or automated
health diagnostic systems.
The Blueprint for an AI Bill of Rights recognizes that law enforcement activities require a balancing of
equities, for example, between the protection of sensitive law enforcement information and the principle of
notice; as such, notice may not be appropriate, or may need to be adjusted to protect sources, methods, and
other law enforcement equities. Even in contexts where these principles may not apply in whole or in part,
federal departments and agencies remain subject to judicial, privacy, and civil liberties oversight as well as
existing policies and safeguards that govern automated systems, including, for example, Executive Order 13960,
Promoting the Use of Trustworthy Artificial Intelligence in the Federal Government (December 2020).
This white paper recognizes that national security (which includes certain law enforcement and
homeland security activities) and defense activities are of increased sensitivity and interest to our nation’s
adversaries and are often subject to special requirements, such as those governing classified information and
other protected data. Such activities require alternative, compatible safeguards through existing policies that
govern automated systems and AI, such as the Department of Defense (DOD) AI Ethical Principles and
Responsible AI Implementation Pathway and the Intelligence Community (IC) AI Ethics Principles and
Framework. The implementation of these policies to national security and defense activities can be informed by
the Blueprint for an AI Bill of Rights where feasible.
- Loss: MultipleNegativesRankingLoss with these parameters: { "scale": 20.0, "similarity_fct": "cos_sim" }
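
MultipleNegativesRankingLoss treats each sentence_1 passage as the positive for its sentence_0 query and uses the other passages in the same batch as in-batch negatives, scoring pairs with cosine similarity scaled by 20. Below is a minimal sketch of setting up the dataset and loss with these parameters; the example pair is illustrative, not taken verbatim from the training set.

```python
from datasets import Dataset
from sentence_transformers import SentenceTransformer
from sentence_transformers.losses import MultipleNegativesRankingLoss
from sentence_transformers.util import cos_sim

# Placeholder: training started from the base model Alibaba-NLP/gte-large-en-v1.5.
model = SentenceTransformer("Alibaba-NLP/gte-large-en-v1.5", trust_remote_code=True)

# Illustrative (sentence_0, sentence_1) pair; the real dataset has 586 such rows.
train_dataset = Dataset.from_dict({
    "sentence_0": ["What is the primary purpose of the Blueprint for an AI Bill of Rights?"],
    "sentence_1": ["The Blueprint for an AI Bill of Rights was published by OSTP in October 2022."],
})

# Same parameters as reported above: scale=20.0, similarity_fct=cos_sim.
loss = MultipleNegativesRankingLoss(model, scale=20.0, similarity_fct=cos_sim)
```
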
Training Hyperparameters
Non-Default Hyperparameters
- eval_strategy: steps
- per_device_train_batch_size: 5
- per_device_eval_batch_size: 5
- num_train_epochs: 2
- multi_dataset_batch_sampler: round_robin
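
Continuing the sketches above, these non-default values map onto SentenceTransformerTrainingArguments roughly as follows. This is a sketch, not the card's exact training script; the output directory is a placeholder, and the model, dataset, loss, and evaluator are assumed to be the objects defined in the earlier snippets.

```python
from sentence_transformers import SentenceTransformerTrainer, SentenceTransformerTrainingArguments
from sentence_transformers.training_args import MultiDatasetBatchSamplers

args = SentenceTransformerTrainingArguments(
    output_dir="gte-large-en-v1.5-ft",  # placeholder output path
    eval_strategy="steps",
    per_device_train_batch_size=5,
    per_device_eval_batch_size=5,
    num_train_epochs=2,
    multi_dataset_batch_sampler=MultiDatasetBatchSamplers.ROUND_ROBIN,
)

trainer = SentenceTransformerTrainer(
    model=model,                 # model, train_dataset, loss from the loss sketch above
    args=args,
    train_dataset=train_dataset,
    loss=loss,
    evaluator=evaluator,         # evaluator from the evaluation sketch above
)
trainer.train()
```
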
All Hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: steps
- prediction_loss_only: True
- per_device_train_batch_size: 5
- per_device_eval_batch_size: 5
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 5e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1
- num_train_epochs: 2
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.0
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: False
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: False
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- dispatch_batches: None
- split_batches: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- eval_use_gather_object: False
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: round_robin
Training Logs
Epoch | Step | dot_map@100 |
---|---|---|
0.4237 | 50 | 0.8383 |
Framework Versions
- Python: 3.10.12
- Sentence Transformers: 3.1.1
- Transformers: 4.44.2
- PyTorch: 2.4.1+cu121
- Accelerate: 0.34.2
- Datasets: 3.0.1
- Tokenizers: 0.19.1
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
MultipleNegativesRankingLoss
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}