---
tags:
- sentence-transformers
- feature-extraction
- sentence-similarity
license: apache-2.0
datasets:
- wikimedia/wikipedia
- SiberiaSoft/SiberianPersonaChat-2
language:
- ru
- en
metrics:
- mse
library_name: transformers
---

# FractalGPT/SbertSVDDistil

This is a [sentence-transformers](https://www.SBERT.net) model: it maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for tasks like clustering or semantic search.

## Usage (Sentence-Transformers)

Using this model is straightforward once you have [sentence-transformers](https://www.SBERT.net) installed:

* [Run the example in Colab](https://colab.research.google.com/drive/1R9hHbEpyGEYO5Nw3p5VWTc-bny3PqiZs?hl)

```
pip install -U sentence-transformers -q
```

Then you can use the model like this:

```python
from transformers import BertModel
import numpy as np
import torch
from torch import nn
from sentence_transformers import SentenceTransformer, util
```

```python
class SVDLinearLayer(nn.Module):
    """Low-rank replacement for a dense layer: the encoder projects the input
    into an h_dim-dimensional bottleneck, the decoder restores the output width."""

    def __init__(self, in_features, out_features, h_dim):
        super(SVDLinearLayer, self).__init__()
        self.encoder = nn.Linear(in_features, h_dim, bias=False)
        self.decoder = nn.Linear(h_dim, out_features, bias=True)

    def forward(self, x):
        x = self.encoder(x)
        x = self.decoder(x)
        return x


class SVDBertModel(BertModel):
    """BERT whose feed-forward projections are replaced with rank-5 SVD
    factorizations in every encoder layer except the first."""

    def __init__(self, config):
        super(SVDBertModel, self).__init__(config)
        for i, layer in enumerate(self.encoder.layer):
            intermediate_size = layer.intermediate.dense.out_features
            output_size = layer.output.dense.out_features
            if i > 0:
                layer.intermediate.dense = SVDLinearLayer(layer.intermediate.dense.in_features, intermediate_size, 5)
                layer.output.dense = SVDLinearLayer(layer.output.dense.in_features, output_size, 5)
            else:
                # The first layer keeps full-rank dense projections.
                layer.intermediate.dense = nn.Linear(layer.intermediate.dense.in_features, intermediate_size, True)
                layer.output.dense = nn.Linear(layer.output.dense.in_features, output_size, True)


def sim(texts_1, texts_2):
    """Cosine-similarity matrix between two lists of texts."""
    embedding_1 = model.encode(texts_1)
    embedding_2 = model.encode(texts_2)
    s = util.pytorch_cos_sim(embedding_1, embedding_2)
    return s.detach().numpy()
```

```python
path = 'FractalGPT/SbertSVDDistil'
model = SentenceTransformer(path)
model[0].auto_model = SVDBertModel.from_pretrained(path)  # load the SVD-decomposed layers correctly
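```

The card does not include the code that originally initialized these factorized layers from the base model's weights. As a rough sketch only (the `svd_compress` helper below is hypothetical, not part of the released code), a trained `nn.Linear` can be converted into an approximately equivalent `SVDLinearLayer` with a truncated SVD:

```python
def svd_compress(linear: nn.Linear, h_dim: int) -> SVDLinearLayer:
    # Hypothetical helper: factor a trained nn.Linear so that
    # decoder(encoder(x)) ~ linear(x) through a rank-h_dim bottleneck.
    U, S, Vh = torch.linalg.svd(linear.weight.data, full_matrices=False)  # weight: (out, in)
    rank = min(h_dim, S.shape[0])
    layer = SVDLinearLayer(linear.in_features, linear.out_features, rank)
    with torch.no_grad():
        layer.encoder.weight.copy_(S[:rank, None] * Vh[:rank])  # (rank, in) = diag(S_r) @ Vh_r
        layer.decoder.weight.copy_(U[:, :rank])                 # (out, rank) = U_r
        layer.decoder.bias.copy_(linear.bias)
    return layer
```

With `h_dim=5`, each factorized projection stores far fewer weights than the original dense layer; per the Training section below, the decomposed model was then trained further.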
A quick cross-lingual check; the second list contains the Russian translations of the English texts in the first list:

```python
sim(["I'm happy",
     "Transistor (English transistor, an acronym invented in 1947 - from the English transfer + English resistor [1] - for a device for passing current through a resistance), semiconductor triode - an electronic component made of semiconductor material, capable of controlling a significant current into the output with a small input signal circuits, which allows it to be used to amplify, generate, switch and convert electrical signals. Currently, the transistor is the basis of the circuit design of the vast majority of electronic devices and integrated circuits.",
     "That is a happy dog",
     "Today is a sunny day",
     "An electric vacuum triode, or simply triode, is an electronic tube that allows an input signal to control the current in an electrical circuit. It has three electrodes: a thermionic cathode (direct or indirectly heated), an anode and one control grid."],
    ["Я счастлив",
     "Транзи́стор (англ. transistor, придуманный в 1947 году акроним — от англ. transfer + англ. resistor[1] — для устройства пропуска тока через сопротивление), полупроводнико́вый трио́д — электронный компонент из полупроводникового материала, способный небольшим входным сигналом управлять значительным током в выходной цепи, что позволяет использовать его для усиления, генерирования, коммутации и преобразования электрических сигналов. В настоящее время транзистор является основой схемотехники подавляющего большинства электронных устройств и интегральных микросхем.",
     "Это счастливая собака",
     "Сегодня солнечный день",
     "Эле́ктрова́куумный трио́д, или просто трио́д, — электронная лампа, позволяющая входным сигналом управлять током в электрической цепи. Имеет три электрода: термоэлектронный катод (прямого или косвенного накала), анод и одну управляющую сетку."])
```

```
array([[ 0.92624545, -0.1081745 ,  0.5569258 ,  0.4006917 ,  0.0524814 ],
       [-0.10137352,  0.9214004 , -0.0590867 , -0.05579955,  0.6043041 ],
       [ 0.56128216, -0.08206842,  0.9496383 ,  0.23291808,  0.03726077],
       [ 0.34002465, -0.05840789,  0.240945  ,  0.9276679 ,  0.09676868],
       [-0.01571994,  0.60077745, -0.00638374, -0.02819303,  0.8434113 ]],
      dtype=float32)
```

Each diagonal entry is the similarity between an English text and its own Russian translation (all above 0.84); off-diagonal entries fall with semantic relatedness, e.g. the related but distinct transistor and triode descriptions score about 0.60.

## Training

* Base model: [FractalGPT/SbertDistil](https://huggingface.co./FractalGPT/SbertDistil).
* Log of the additional training performed after decomposition.

## Full Model Architecture

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: SVDBertModel
  (1): Pooling({'word_embedding_dimension': 312, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False})
  (2): Dense({'in_features': 312, 'out_features': 384, 'bias': True, 'activation_function': 'torch.nn.modules.linear.Identity'})
)
```
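To get a feel for what the decomposition saves, one can compare parameter counts against the base model. This is an illustrative check, not from the original card; it assumes `model` was loaded as in the usage example above:

```python
# Illustrative check: parameter counts of the base vs. the SVD-decomposed encoder.
base = SentenceTransformer('FractalGPT/SbertDistil')

def count_params(m):
    return sum(p.numel() for p in m.parameters())

print(f"SbertDistil encoder:    {count_params(base[0].auto_model):,}")
print(f"SbertSVDDistil encoder: {count_params(model[0].auto_model):,}")
```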