AI Secure

DecodingTrust Public

A Comprehensive Assessment of Trustworthiness in GPT Models

Python, 314 stars, 61 forks

AgentPoison Public

[NeurIPS 2024] Official implementation for "AgentPoison: Red-teaming LLM Agents via Memory or Knowledge Base Backdoor Poisoning"

Python, 204 stars, 27 forks

DBA Public

DBA: Distributed Backdoor Attacks against Federated Learning (ICLR 2020)

Python, 203 stars, 48 forks

Certified-Robustness-SoK-Oldver Public

This repo tracks popular provable training and verification approaches for robust neural networks, including leaderboards on popular datasets and a categorization of papers.

98 stars, 10 forks

VeriGauge Public

A unified toolbox for running major robustness verification approaches for DNNs. [S&P 2023]

C, 90 stars, 7 forks

InfoBERT Public

[ICLR 2021] "InfoBERT: Improving Robustness of Language Models from An Information Theoretic Perspective" by Boxin Wang, Shuohang Wang, Yu Cheng, Zhe Gan, Ruoxi Jia, Bo Li, Jingjing Liu

Python, 85 stars, 8 forks
