aoxo committed on
Commit
2aef983
1 Parent(s): a50d45d

Update README.md

  1. README.md +8 -1
README.md CHANGED
@@ -114,7 +114,14 @@ The model was trained on [Pre-Training Dataset](https://huggingface.co/datasets/
 
  #### Preprocessing
 
- Images and their corresponding style semantic maps were resized to fit the input-output window dimensions (512 x 512). Bit depth has been recorrected to 24bit (3 channel) for images with depth greater than 24bit.
 
  #### Training Hyperparameters
 
 
  #### Preprocessing
 
+ Here's a more concise version of your original paragraph, maintaining the essential information:
+
+ ---
+
+ **Preprocessing of Large-Scale Image Data for Photorealism Enhancement**
+ This section details our methodology for preprocessing a large-scale dataset of approximately 117 million game-rendered frames from 9 AAA video games and 1.24 billion real-world images from Mapillary Vistas and Cityscapes, all in 4K resolution. The goal is to pair each game frame with the real images that have the highest cosine similarity to it in structural and visual features, so that fine details such as object positions are aligned.
+
+ Images and their corresponding style semantic maps were resized to **512 x 512** pixels, and any image with a bit depth above 24 bits was converted to **24-bit** depth (3 channels). We employ a novel **feature-mapped channel-split PSNR matching** approach that combines **EfficientNet** feature extraction, channel splitting, and the computation of two metrics, PSNR and cosine similarity. **Locality-Sensitive Hashing** (LSH) is used to find the **top-10 nearest neighbors** of each frame efficiently. This yielded **1.17 billion** frame-image pairs and **12.4 billion** image-frame pairs. The final selection step checks that similarity is consistent across channels before a pairing is accepted. The pipeline scales to the full dataset while preserving critical visual details, and it lays the foundation for subsequent **contrastive learning** to enhance **photorealism in game-rendered frames**.
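
The dual-metric, per-channel comparison described above can be illustrated with a minimal sketch. This is not the author's implementation: the function names (`psnr`, `cosine_similarity`, `channel_scores`, `top_k_matches`) are hypothetical, plain NumPy arrays stand in for EfficientNet feature maps, and a brute-force ranking stands in for the LSH lookup, which would only be needed at the dataset scale quoted in the README.

```python
import numpy as np


def psnr(a, b, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two same-shaped arrays."""
    mse = np.mean((a.astype(np.float64) - b.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical inputs
    return float(10.0 * np.log10(max_value ** 2 / mse))


def cosine_similarity(a, b):
    """Cosine similarity between two arrays, each flattened to a vector."""
    a = a.astype(np.float64).ravel()
    b = b.astype(np.float64).ravel()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0


def channel_scores(frame, image):
    """Split two H x W x C arrays by channel; return (PSNR, cosine) per channel."""
    return [
        (psnr(frame[..., c], image[..., c]),
         cosine_similarity(frame[..., c], image[..., c]))
        for c in range(frame.shape[-1])
    ]


def top_k_matches(frame, candidates, k=10):
    """Rank candidates by mean per-channel cosine similarity.

    Assumption: exhaustive search replaces LSH here for clarity.
    """
    scored = []
    for idx, candidate in enumerate(candidates):
        mean_cos = np.mean([cos for _, cos in channel_scores(frame, candidate)])
        scored.append((float(mean_cos), idx))
    scored.sort(reverse=True)  # highest similarity first
    return [idx for _, idx in scored[:k]]
```

Calling `top_k_matches(frame, candidates, k=10)` returns the indices of the ten most similar candidates. The per-channel tuples from `channel_scores` could additionally be used to reject pairs whose similarity varies strongly between channels, which is one possible reading of the consistency check mentioned in the paragraph.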
 
  #### Training Hyperparameters