---
library_name: hierarchy-transformers
pipeline_tag: feature-extraction
tags:
- hierarchy-transformers
- feature-extraction
- hierarchy-encoding
- subsumption-relationships
- transformers
license: apache-2.0
language:
- en
metrics:
- precision
- recall
- f1
base_model:
- sentence-transformers/all-MiniLM-L12-v2
---

# Hierarchy-Transformers/HiT-MiniLM-L12-WordNetNoun

A **Hi**erarchy **T**ransformer Encoder (HiT) model that explicitly encodes entities according to their hierarchical relationships.

### Model Description

<!-- Provide a longer summary of what this model is. -->

HiT-MiniLM-L12-WordNetNoun is a HiT model trained on WordNet's subsumption (hypernym) hierarchy of noun entities.

- **Developed by:** [Yuan He](https://www.yuanhe.wiki/), Zhangdie Yuan, Jiaoyan Chen, and Ian Horrocks
- **Model type:** Hierarchy Transformer Encoder (HiT) 
- **License:** Apache license 2.0
- **Hierarchy**: WordNet's subsumption (hypernym) hierarchy of noun entities.
- **Training Dataset**: [Hierarchy-Transformers/WordNetNoun](https://huggingface.co/datasets/Hierarchy-Transformers/WordNetNoun)
- **Pre-trained model:** [sentence-transformers/all-MiniLM-L12-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L12-v2)
- **Training Objectives**: Jointly optimised on *Hyperbolic Clustering* and *Hyperbolic Centripetal* losses (see definitions in the [paper](https://arxiv.org/abs/2401.11374); sketched below)
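
In brief (a paraphrase of the paper's definitions, with $d$ the hyperbolic distance, $O$ the manifold origin, and $\alpha$, $\beta$ the margin hyperparameters; the symbol names here are assumed):

$$
\mathcal{L}_{cluster} = \max\big(0,\, d(e_c, e_p) - d(e_c, e_n) + \alpha\big), \qquad
\mathcal{L}_{centri} = \max\big(0,\, d(e_p, O) - d(e_c, O) + \beta\big)
$$

where $e_c$, $e_p$, $e_n$ embed a child entity, its parent, and a negative parent: the clustering loss pulls true child-parent pairs closer together than negative pairs, while the centripetal loss pushes parents nearer to the manifold origin than their children.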

### Model Versions

| **Version** | **Model Revision** | **Note** |
|------------|---------|----------|
|v1.0 (Random Negatives)| `main` or `v1-random-negatives`| The variant trained on random negatives, as detailed in the [paper](https://arxiv.org/abs/2401.11374).|
|v1.0 (Hard Negatives)| `v1-hard-negatives` | The variant trained on hard negatives, as detailed in the [paper](https://arxiv.org/abs/2401.11374). |
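
To load a specific variant, pin the corresponding model revision (a minimal sketch, assuming `from_pretrained` forwards the standard Hugging Face `revision` argument):

```python
from hierarchy_transformers import HierarchyTransformer

# load the hard-negatives variant by pinning its model revision
model = HierarchyTransformer.from_pretrained(
    'Hierarchy-Transformers/HiT-MiniLM-L12-WordNetNoun',
    revision='v1-hard-negatives',
)
```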


### Model Sources

<!-- Provide the basic links for the model. -->

- **Repository:** https://github.com/KRR-Oxford/HierarchyTransformers
- **Paper:** [Language Models as Hierarchy Encoders](https://arxiv.org/abs/2401.11374)

## Usage

<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->

HiT models encode entities (presented as text) and predict their hierarchical relationships in hyperbolic space.

### Get Started

Install the `hierarchy_transformers` package through `pip` (`pip install hierarchy_transformers`) or from source via our [repository](https://github.com/KRR-Oxford/HierarchyTransformers).

Use the code below to get started with the model.

```python
from hierarchy_transformers import HierarchyTransformer

# load the model
model = HierarchyTransformer.from_pretrained('Hierarchy-Transformers/HiT-MiniLM-L12-WordNetNoun')

# entity names to be encoded.
entity_names = ["computer", "personal computer", "fruit", "berry"]

# get the entity embeddings
entity_embeddings = model.encode(entity_names)
```
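
The returned embeddings are points on the model's hyperbolic manifold (a Poincaré ball). As a quick sanity check (a sketch, assuming `encode` accepts `convert_to_tensor` as in the probing example below), you can inspect their shape and their hyperbolic distance to the origin:

```python
# encode as tensors so the manifold operations below apply directly
entity_embeddings = model.encode(entity_names, convert_to_tensor=True)

print(entity_embeddings.shape)                  # e.g. (4, 384) for the four names above
print(model.manifold.dist0(entity_embeddings))  # hyperbolic norms (distance to the origin)
```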

### Default Probing for Subsumption Prediction

Use the entity embeddings to predict the subsumption relationships between them.

```python
# suppose we want to compare "personal computer" and "computer", "berry" and "fruit"
child_entity_embeddings = model.encode(["personal computer", "berry"], convert_to_tensor=True)
parent_entity_embeddings = model.encode(["computer", "fruit"], convert_to_tensor=True)

# compute the hyperbolic distances and norms of entity embeddings
dists = model.manifold.dist(child_entity_embeddings, parent_entity_embeddings)
child_norms = model.manifold.dist0(child_entity_embeddings)
parent_norms = model.manifold.dist0(parent_entity_embeddings)

# use the empirical scoring function for subsumption prediction proposed in the paper;
# `centri_score_weight` and the overall threshold are determined on the validation set
centri_score_weight = 1.0  # placeholder value for illustration; tune on the validation set
subsumption_scores = - (dists + centri_score_weight * (parent_norms - child_norms))
```
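
Higher scores indicate a more likely subsumption. To turn scores into binary predictions, compare them against a threshold selected on the validation set (the value below is a placeholder for illustration):

```python
# hypothetical threshold; in practice, select it on the validation set
threshold = -10.0
predictions = subsumption_scores > threshold  # True where child-parent subsumption is predicted
print(predictions)
```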

### Train Your Own Models

Use the example scripts in our [repository](https://github.com/KRR-Oxford/HierarchyTransformers/tree/main/scripts) to reproduce existing models and train/evaluate your own models. 

## Full Model Architecture
```
HierarchyTransformer(
  (0): Transformer({'max_seq_length': 128, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False})
)
```
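
Since HiT models build on Sentence Transformers, the usual accessors should be available for inspecting these settings programmatically (a sketch, assuming the standard `SentenceTransformer` interface is inherited):

```python
print(model.max_seq_length)                      # 128
print(model.get_sentence_embedding_dimension())  # 384
print(model.manifold)                            # the hyperbolic (Poincaré ball) manifold
```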

## Citation

<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->

*Yuan He, Zhangdie Yuan, Jiaoyan Chen, Ian Horrocks.* **Language Models as Hierarchy Encoders.** Advances in Neural Information Processing Systems 37 (NeurIPS 2024).

```
@inproceedings{NEURIPS2024_1a970a3e,
 author = {He, Yuan and Yuan, Moy and Chen, Jiaoyan and Horrocks, Ian},
 booktitle = {Advances in Neural Information Processing Systems},
 editor = {A. Globerson and L. Mackey and D. Belgrave and A. Fan and U. Paquet and J. Tomczak and C. Zhang},
 pages = {14690--14711},
 publisher = {Curran Associates, Inc.},
 title = {Language Models as Hierarchy Encoders},
 url = {https://proceedings.neurips.cc/paper_files/paper/2024/file/1a970a3e62ac31c76ec3cea3a9f68fdf-Paper-Conference.pdf},
 volume = {37},
 year = {2024}
}
```


## Model Card Contact

For any queries or feedback, please contact Yuan He (`yuan.he(at)cs.ox.ac.uk`).