Initial commit for AgriBrain's AI-core, agbrain


Files changed (9)

README.md +134 -0
config.json +39 -0
generation_config.json +6 -0
merges.txt +0 -0
pytorch_model.bin +3 -0
special_tokens_map.json +23 -0
tf_model.h5 +3 -0
tokenizer_config.json +33 -0
vocab.json +0 -0

README.md ADDED

	@@ -0,0 +1,134 @@

+---
+library_name: transformers
+license: apache-2.0
+metrics:
+- accuracy
+pipeline_tag: text-generation
+tags:
+- text-generation-inference
+---
+# AgriBrain's AI-core, agbrain
+---
+AgriBrain's AI-core, agbrain, is a natural language processing
+(NLP) model built specifically for generating content related to
+agriculture. The model is a fine-tuned version of the GPT-2
+language model, trained on a corpus of 1601 PDF documents sourced
+from various reputable online resources.
+
+Agbrain is designed to serve the agriculture industry, including
+farmers, agronomists, agricultural researchers, and other
+stakeholders.
+
+One of the key strengths of Agbrain is its ability to generate
+coherent and contextually relevant content. Fine-tuning on
+domain text is intended to keep the generated content accurate
+and informative. The model can produce content on a wide range
+of topics, including crop cultivation, livestock management,
+pest control, and irrigation.
+
+Overall, Agbrain is a versatile NLP model for the agriculture industry.
+# Usage
+---
+## Transformers and model.generate
+---
+```python
+import tensorflow as tf
+from transformers import TFGPT2LMHeadModel, GPT2Tokenizer
+tokenizer = GPT2Tokenizer.from_pretrained("benkimz/agbrain")
+model = TFGPT2LMHeadModel.from_pretrained("benkimz/agbrain")
+prompt = """
+I think agribusiness is a great opportunity for passionate
+investors. From food business to growing crops for sale,
+and rearing livestock for business.
+"""
+input_ids = tokenizer.encode(prompt, return_tensors="tf")
+outputs = model.generate(input_ids=input_ids,
+          max_length=120,   # total length in tokens, prompt included
+          do_sample=True)   # sample instead of greedy decoding
+generated_text = tokenizer.decode(outputs[0],
+          skip_special_tokens=True)
+print(generated_text)
+# Output
+"""
+I think agribusiness is a great opportunity for passionate
+investors. From food business to growing crops for sale,
+and rearing livestock for business.
+In this paper I will introduce a concept model agribusiness
+that focuses on businesses to grow large amounts of product.
+ This model requires that product be sold outside of
+agriculture industry, thus allowing farmers advantages,
+especially over agronomic competition in production.
+model is very important to farmers as it will be possible,
+to sell their products at local markets without
+"""
+```
+## Transformers pipeline
+---
+```python
+from transformers import pipeline, set_seed
+generator = pipeline('text-generation', model='benkimz/agbrain')
+set_seed(42)
+samples = generator(
+    "Animal husbandry is an important part of livestock production.",
+    max_length=100,
+    num_return_sequences=2
+)
+for sample in samples:
+  print("Model output: {}\n".format(sample['generated_text']))
+# Output
+"""
+Model output: Animal husbandry is an important part of
+livestock production.  livestock production industry is complex,
+many factors contribute to this complexity.  need to determine
+most efficient method of handling livestock to ensure best quality
+product. It is important that animals being handled appropriately
+have properly cleaned equipment that prevents scratching
+(Sappell 2002). Because livestock is an important part of
+livestock production, veterinary care must be taken regularly
+during transport of animals from a farm to your home to be
+successful. If livestock were to be
+Model output: Animal husbandry is an important part of
+livestock production. Animal husbandry combines various
+strategies to control pests. Management strategies of pest
+management strategies
+Preventing pest from reaching level
+ Preventing pest from reaching level
+To minimize transmission costs, control mechanisms
+ must be developed to prevent pest from reaching level. In
+order to have an accurate information about pest
+management methods, instrumental field study of pest management
+measures be developed by field of study. A technique of this
+"""
+```
+# Metrics
+---
+Step|Training Loss
+----|---------------
+500|3.877700
+1000|3.746200
+1500|3.659600
+2000|3.613300
+2500|3.603400
+3000|3.561600
+3500|3.558300
+4000|3.518400
+4500|3.504100
+5000|3.508600
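
The training losses above can be easier to compare when read as perplexity, which is `exp(loss)`. The sketch below assumes the reported value is the mean per-token cross-entropy in nats; this commit does not state how the loss was computed, so treat that as an assumption. The `losses` dict simply copies the table, and `perplexity` is a helper name introduced here for illustration.

```python
import math

# Training loss by step, copied from the Metrics table above.
losses = {
    500: 3.8777, 1000: 3.7462, 1500: 3.6596, 2000: 3.6133, 2500: 3.6034,
    3000: 3.5616, 3500: 3.5583, 4000: 3.5184, 4500: 3.5041, 5000: 3.5086,
}

def perplexity(loss):
    """Convert a mean cross-entropy loss (assumed to be in nats) to perplexity."""
    return math.exp(loss)

for step, loss in losses.items():
    print(f"step {step:>4}  loss {loss:.4f}  perplexity {perplexity(loss):.1f}")
```

Note that the loss at step 5000 is slightly higher than at step 4500, so the curve has flattened rather than strictly decreased by the end of the table.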

config.json ADDED

	@@ -0,0 +1,39 @@

+{
+  "_name_or_path": "./",
+  "activation_function": "gelu_new",
+  "architectures": [
+    "GPT2LMHeadModel"
+  ],
+  "attn_pdrop": 0.1,
+  "bos_token_id": 50256,
+  "embd_pdrop": 0.1,
+  "eos_token_id": 50256,
+  "initializer_range": 0.02,
+  "layer_norm_epsilon": 1e-05,
+  "model_type": "gpt2",
+  "n_ctx": 1024,
+  "n_embd": 768,
+  "n_head": 12,
+  "n_inner": null,
+  "n_layer": 12,
+  "n_positions": 1024,
+  "reorder_and_upcast_attn": false,
+  "resid_pdrop": 0.1,
+  "scale_attn_by_inverse_layer_idx": false,
+  "scale_attn_weights": true,
+  "summary_activation": null,
+  "summary_first_dropout": 0.1,
+  "summary_proj_to_labels": true,
+  "summary_type": "cls_index",
+  "summary_use_proj": true,
+  "task_specific_params": {
+    "text-generation": {
+      "do_sample": true,
+      "max_length": 50
+    }
+  },
+  "torch_dtype": "float32",
+  "transformers_version": "4.27.3",
+  "use_cache": true,
+  "vocab_size": 50257
+}
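
As a sanity check on the configuration above, the parameter count of a GPT-2 style model can be estimated from `vocab_size`, `n_positions`, `n_embd`, `n_layer`, and `n_inner`. The sketch below is a back-of-the-envelope calculation, not code from this repository. It assumes the standard GPT-2 block layout (fused QKV projection, two-layer MLP, two layer norms per block, one final layer norm), that `n_inner: null` means `4 * n_embd`, and that the LM head shares weights with the token embedding. `gpt2_param_count` is a hypothetical helper name.

```python
def gpt2_param_count(vocab_size, n_positions, n_embd, n_layer, n_inner=None):
    """Estimate trainable parameters for a GPT-2 style decoder.

    Assumes tied input/output embeddings, so the LM head adds nothing.
    """
    if n_inner is None:
        n_inner = 4 * n_embd  # GPT-2 default when n_inner is null

    embeddings = vocab_size * n_embd + n_positions * n_embd  # token + position
    layer_norm = 2 * n_embd                                  # weight + bias

    # Attention: fused QKV projection, then output projection (with biases).
    attention = (n_embd * 3 * n_embd + 3 * n_embd) + (n_embd * n_embd + n_embd)
    # MLP: expand to n_inner, then project back (with biases).
    mlp = (n_embd * n_inner + n_inner) + (n_inner * n_embd + n_embd)

    block = layer_norm + attention + layer_norm + mlp
    return embeddings + n_layer * block + layer_norm  # final layer norm

# Values taken from config.json above.
params = gpt2_param_count(vocab_size=50257, n_positions=1024,
                          n_embd=768, n_layer=12, n_inner=None)
bytes_fp32 = params * 4  # "torch_dtype": "float32" means 4 bytes per parameter

print(f"parameters: {params:,}")
print(f"float32 size: {bytes_fp32:,} bytes")
```

Under these assumptions the float32 weights alone come to 497,759,232 bytes, which is close to the `tf_model.h5` size recorded in this commit. `pytorch_model.bin` is somewhat larger, likely because that checkpoint also stores non-parameter buffers.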

generation_config.json ADDED

	@@ -0,0 +1,6 @@

+{
+  "_from_model_config": true,
+  "bos_token_id": 50256,
+  "eos_token_id": 50256,
+  "transformers_version": "4.27.3"
+}

merges.txt ADDED

The diff for this file is too large to render. See raw diff

pytorch_model.bin ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:a2c4eca867e6bfb4a911d4dcd916d8232e7559a4fc0adc6d70ec822ef4776439
+size 510398013
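
The three lines above are a Git LFS pointer, not the weights themselves: `version` names the pointer spec, `oid` gives the hash algorithm and digest of the real file, and `size` is its length in bytes. A minimal sketch of reading such a pointer and checking a downloaded file against it follows. `parse_lfs_pointer` and `matches_pointer` are hypothetical helper names, and the local path passed to `matches_pointer` is an assumption for the reader to fill in.

```python
import hashlib
import os

def parse_lfs_pointer(text):
    """Parse 'key value' lines of a Git LFS pointer into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }

def matches_pointer(path, info, chunk_size=1 << 20):
    """Return True if the file at `path` has the size and hash in `info`."""
    if os.path.getsize(path) != info["size"]:
        return False
    hasher = hashlib.new(info["hash_algo"])
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == info["digest"]

# Pointer contents copied from the pytorch_model.bin diff above.
pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:a2c4eca867e6bfb4a911d4dcd916d8232e7559a4fc0adc6d70ec822ef4776439
size 510398013
"""
info = parse_lfs_pointer(pointer)
print(info["hash_algo"], info["size"])
```

Calling `matches_pointer("pytorch_model.bin", info)` on a local copy would then confirm that the download is complete and uncorrupted.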

special_tokens_map.json ADDED

	@@ -0,0 +1,23 @@

+{
+  "bos_token": {
+    "content": "<|endoftext|>",
+    "lstrip": false,
+    "normalized": true,
+    "rstrip": false,
+    "single_word": false
+  },
+  "eos_token": {
+    "content": "<|endoftext|>",
+    "lstrip": false,
+    "normalized": true,
+    "rstrip": false,
+    "single_word": false
+  },
+  "unk_token": {
+    "content": "<|endoftext|>",
+    "lstrip": false,
+    "normalized": true,
+    "rstrip": false,
+    "single_word": false
+  }
+}

tf_model.h5 ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:e178bfc06c22fd9171906b465eb6a86499e3cd0cf6c241a478bbfabcfd895f20
+size 497935440

tokenizer_config.json ADDED

	@@ -0,0 +1,33 @@

+{
+  "add_bos_token": false,
+  "add_prefix_space": false,
+  "bos_token": {
+    "__type": "AddedToken",
+    "content": "<|endoftext|>",
+    "lstrip": false,
+    "normalized": true,
+    "rstrip": false,
+    "single_word": false
+  },
+  "eos_token": {
+    "__type": "AddedToken",
+    "content": "<|endoftext|>",
+    "lstrip": false,
+    "normalized": true,
+    "rstrip": false,
+    "single_word": false
+  },
+  "errors": "replace",
+  "model_max_length": 1024,
+  "pad_token": null,
+  "special_tokens_map_file": null,
+  "tokenizer_class": "GPT2Tokenizer",
+  "unk_token": {
+    "__type": "AddedToken",
+    "content": "<|endoftext|>",
+    "lstrip": false,
+    "normalized": true,
+    "rstrip": false,
+    "single_word": false
+  }
+}
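
One detail in this tokenizer config matters for batched inference: `pad_token` is `null`, while `bos_token`, `eos_token`, and `unk_token` all map to `<|endoftext|>`. Padding a batch therefore needs a pad token chosen by the caller. A common workaround for GPT-2 style tokenizers is to reuse the end-of-text token, which in `transformers` is usually done with `tokenizer.pad_token = tokenizer.eos_token`. The sketch below illustrates the same fallback on the raw JSON only. `resolve_pad_token` and `token_text` are hypothetical helpers, and the trimmed `tokenizer_config` dict is an abbreviated copy of the file above.

```python
import json

# Abbreviated copy of tokenizer_config.json: only the fields used here.
tokenizer_config = json.loads("""
{
  "pad_token": null,
  "eos_token": {"__type": "AddedToken", "content": "<|endoftext|>"}
}
""")

def token_text(entry):
    """Return the token string from either a plain string or an AddedToken dict."""
    if entry is None:
        return None
    return entry["content"] if isinstance(entry, dict) else entry

def resolve_pad_token(config):
    """Use the configured pad token, falling back to EOS when it is unset."""
    pad = token_text(config.get("pad_token"))
    return pad if pad is not None else token_text(config.get("eos_token"))

print(resolve_pad_token(tokenizer_config))
```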

vocab.json ADDED

The diff for this file is too large to render. See raw diff