Update README.md
README.md CHANGED
@@ -114,7 +114,14 @@ The model was trained on [Pre-Training Dataset](https://huggingface.co/datasets/
 
 #### Preprocessing
 
-
+---
+
+**Preprocessing of Large-Scale Image Data for Photorealism Enhancement**
+
+This section details our methodology for preprocessing a large-scale dataset of approximately 117 million game-rendered frames from 9 AAA video games and 1.24 billion real-world images from Mapillary Vistas and Cityscapes, all at 4K resolution. The goal is to pair each game frame with the real images that exhibit the highest cosine similarity in structural and visual features, ensuring alignment of fine details such as object positions.
+
+Images and their corresponding style semantic maps were resized to **512 x 512** pixels and converted to **24-bit** depth (3 channels) where they exceeded it. We employ a novel **feature-mapped channel-split PSNR matching** approach that combines **EfficientNet** feature extraction, channel splitting, and dual-metric computation of PSNR and cosine similarity, with **Locality-Sensitive Hashing** (LSH) used to efficiently identify the **top-10 nearest neighbors** for each frame. This produced **1.17 billion** frame-image pairs and **12.4 billion** image-frame pairs. The final selection step assesses similarity consistency across channels to ensure accurate pairings. This scalable preprocessing pipeline enables efficient pairing while preserving critical visual details, laying the foundation for subsequent **contrastive learning** to enhance **photorealism in game-rendered frames**.
 
 #### Training Hyperparameters
 
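
For readers who want to see the pairing step in code, below is a minimal, self-contained sketch of the channel-split PSNR/cosine scoring described above. It is an illustration, not the released pipeline: torchvision's `efficientnet_b0` stands in for the exact EfficientNet variant, the file names and acceptance thresholds are hypothetical, and the LSH candidate-retrieval stage is omitted (only the scoring of a single game-frame/real-image pair is shown).

```python
# Minimal sketch of channel-split feature matching (illustrative only).
# Assumptions: torchvision's efficientnet_b0 as the feature extractor,
# hypothetical file names, and an arbitrary acceptance rule.
import torch
import torch.nn.functional as F
from PIL import Image
from torchvision import models, transforms

device = "cuda" if torch.cuda.is_available() else "cpu"

# Pretrained backbone used purely as a feature extractor (classifier ignored).
backbone = models.efficientnet_b0(weights=models.EfficientNet_B0_Weights.DEFAULT)
backbone.eval().to(device)

preprocess = transforms.Compose([
    transforms.Resize((512, 512)),   # resize to 512 x 512
    transforms.ToTensor(),           # 3-channel float tensor in [0, 1]
])                                   # ImageNet normalization omitted for brevity

def load_image(path: str) -> torch.Tensor:
    """Load an image, force 24-bit depth (3 channels), resize to 512 x 512."""
    img = Image.open(path).convert("RGB")   # drops alpha / higher bit depths
    return preprocess(img).unsqueeze(0).to(device)

@torch.no_grad()
def feature_map(x: torch.Tensor) -> torch.Tensor:
    """EfficientNet feature maps of shape [1, C, H, W]."""
    return backbone.features(x)

def channel_split_scores(fa: torch.Tensor, fb: torch.Tensor):
    """Per-channel PSNR and cosine similarity between two feature maps."""
    a, b = fa.flatten(2), fb.flatten(2)             # [1, C, H*W]
    mse = ((a - b) ** 2).mean(dim=2)                # [1, C]
    peak = torch.maximum(a.amax(dim=2), b.amax(dim=2)).clamp(min=1e-8)
    psnr = 10.0 * torch.log10(peak ** 2 / mse.clamp(min=1e-12))
    cos = F.cosine_similarity(a, b, dim=2)          # [1, C]
    return psnr.squeeze(0), cos.squeeze(0)

# Hypothetical file names, for illustration only.
game_feats = feature_map(load_image("game_frame_000001.png"))
real_feats = feature_map(load_image("real_image_000001.jpg"))
psnr_c, cos_c = channel_split_scores(game_feats, real_feats)

# Similarity-consistency check across channels; this acceptance rule is an
# assumption, not the recipe used for the released dataset.
keep = cos_c.mean().item() > 0.8 and cos_c.std().item() < 0.1
print(f"mean cosine={cos_c.mean().item():.3f}  "
      f"mean PSNR={psnr_c.mean().item():.1f} dB  keep={keep}")
```

In the full pipeline described above, an LSH index over pooled feature vectors would first shortlist roughly the top-10 candidate real images per game frame; per-channel scores like these would then be checked for consistency to accept or reject each candidate pair.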