multi-train committed on
Commit
aba2bca
1 Parent(s): 6492568

Update README.md

Files changed (1)
  1. README.md +16 -0
README.md CHANGED
@@ -60,4 +60,20 @@ corpus_embeddings = model.encode(corpus)
  similarities = cosine_similarity(query_embeddings,corpus_embeddings)
  retrieved_doc_id = np.argmax(similarities)
  print(retrieved_doc_id)
+ ```
+
+ ## Clustering
+ Use **customized embeddings** to cluster texts into groups.
+ ```python
+ import sklearn.cluster  # import the submodule explicitly; a bare `import sklearn` may not load it
+ sentences = [['Represent the Medicine sentence for clustering; Input: ','Dynamical Scalar Degree of Freedom in Horava-Lifshitz Gravity', 0],
+ ['Represent the Medicine sentence for clustering; Input: ','Comparison of Atmospheric Neutrino Flux Calculations at Low Energies', 0],
+ ['Represent the Medicine sentence for clustering; Input: ','Fermion Bags in the Massive Gross-Neveu Model', 0],
+ ['Represent the Medicine sentence for clustering; Input: ',"QCD corrections to Associated t-tbar-H production at the Tevatron",0],
+ ['Represent the Medicine sentence for clustering; Input: ','A New Analysis of the R Measurements: Resonance Parameters of the Higher, Vector States of Charmonium',0]]
+ embeddings = model.encode(sentences)
+ clustering_model = sklearn.cluster.MiniBatchKMeans(n_clusters=2)
+ clustering_model.fit(embeddings)
+ cluster_assignment = clustering_model.labels_
+ print(cluster_assignment)
  ```
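
The clustering step in the added README section can be tried without loading the model. The following is a minimal sketch, not the author's code: it assumes synthetic 8-dimensional vectors as a stand-in for `model.encode(sentences)`, and the names `group_a`, `group_b`, and `clusters` are hypothetical. It shows how the labels returned by `MiniBatchKMeans` map back to input indices.

```python
import numpy as np
from sklearn.cluster import MiniBatchKMeans

# Stand-in for model.encode(sentences): two well-separated groups of
# synthetic vectors. This is an assumption for illustration only; real
# embeddings would come from the model.
rng = np.random.default_rng(0)
group_a = rng.normal(loc=0.0, scale=0.05, size=(3, 8))
group_b = rng.normal(loc=5.0, scale=0.05, size=(2, 8))
embeddings = np.vstack([group_a, group_b])

# Same estimator as in the README snippet, with a fixed seed for repeatability.
clustering_model = MiniBatchKMeans(n_clusters=2, n_init=3, random_state=0)
clustering_model.fit(embeddings)
cluster_assignment = clustering_model.labels_

# Group the input row indices by their assigned cluster label.
clusters = {}
for idx, label in enumerate(cluster_assignment):
    clusters.setdefault(int(label), []).append(idx)
print(clusters)
```

The numeric cluster labels are arbitrary, so only the grouping of indices is meaningful, not which group is called 0 or 1.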