JianyuanWang committed · Commit 9a1dda4 · Parent(s): b19c7bf

update robust
Files changed:
- app.py (+3 -3)
- vggsfm_code/cfgs/demo.yaml (+1 -1)
- vggsfm_code/vggsfm/utils/triangulation_helpers.py (+31 -7)
app.py CHANGED

@@ -205,7 +205,7 @@ with gr.Blocks() as demo:
 </ul>
 <p>If both images and videos are uploaded, the demo will only reconstruct the uploaded images. By default, we extract <strong> 1 image frame per second from the input video </strong>. To prevent crashes on the Hugging Face space, we currently limit reconstruction to the first 25 image frames. </p>
 <p>SfM methods are designed for <strong> rigid/static reconstruction </strong>. When dealing with dynamic/moving inputs, these methods may still work by focusing on the rigid parts of the scene. However, to ensure high-quality results, it is better to minimize the presence of moving objects in the input data. </p>
-<p>The reconstruction should typically take <strong> up to 90 seconds </strong>. If it takes longer, the input data is likely not well-conditioned. </p>
+<p>The reconstruction should typically take <strong> up to 90 seconds </strong>. If it takes longer, the input data is likely not well-conditioned or the query images/points are set too high. </p>
 <p>If you meet any problem, feel free to create an issue in our <a href="https://github.com/facebookresearch/vggsfm" target="_blank">GitHub Repo</a> ⭐</p>
 <p>(Please note that running reconstruction on Hugging Face space is slower than on a local machine.) </p>
 </div>
@@ -215,9 +215,9 @@ with gr.Blocks() as demo:
         with gr.Column(scale=1):
             input_video = gr.Video(label="Input video", interactive=True)
             input_images = gr.File(file_count="multiple", label="Input Images", interactive=True)
-            num_query_images = gr.Slider(minimum=1, maximum=
+            num_query_images = gr.Slider(minimum=1, maximum=10, step=1, value=4, label="Number of query images (key frames)",
                                          info="More query images usually lead to better reconstruction at lower speeds. If the viewpoint differences between your images are minimal, you can set this value to 1. ")
-            num_query_points = gr.Slider(minimum=512, maximum=
+            num_query_points = gr.Slider(minimum=512, maximum=4096, step=1, value=1024, label="Number of query points",
                                          info="More query points usually lead to denser reconstruction at lower speeds.")
 
         with gr.Column(scale=3):
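For reference, the two sliders touched by this commit cap the query images at 10 (default 4) and the query points at 4096 (default 1024). The following is a minimal, self-contained sketch of the same slider setup; the run_reconstruction callback, the button, and the status textbox are placeholders for illustration only, not the Space's actual handler:

import gradio as gr

def run_reconstruction(num_query_images, num_query_points):
    # Placeholder: the real demo would launch the VGGSfM pipeline here.
    return f"Would reconstruct with {num_query_images} query images and {num_query_points} query points."

with gr.Blocks() as demo:
    with gr.Row():
        with gr.Column(scale=1):
            num_query_images = gr.Slider(
                minimum=1, maximum=10, step=1, value=4,
                label="Number of query images (key frames)",
                info="More query images usually lead to better reconstruction at lower speeds.",
            )
            num_query_points = gr.Slider(
                minimum=512, maximum=4096, step=1, value=1024,
                label="Number of query points",
                info="More query points usually lead to denser reconstruction at lower speeds.",
            )
            btn = gr.Button("Reconstruct")
        with gr.Column(scale=3):
            out = gr.Textbox(label="Status")
    btn.click(run_reconstruction, inputs=[num_query_images, num_query_points], outputs=out)

demo.launch()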
vggsfm_code/cfgs/demo.yaml CHANGED

@@ -17,7 +17,7 @@ filter_invalid_frame: True
 comple_nonvis: True
 query_frame_num: 3
 robust_refine: 2
-BA_iters:
+BA_iters: 1
 
 low_mem: True
 
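The only change here pins BA_iters to 1. A rough sketch of how such a setting might be consumed, assuming a plain YAML load and that BA_iters presumably controls how many bundle-adjustment refinement passes are run; the actual vggsfm code may load this config differently (e.g. via Hydra/OmegaConf), and run_bundle_adjustment below is a hypothetical placeholder:

import yaml

with open("vggsfm_code/cfgs/demo.yaml") as f:
    cfg = yaml.safe_load(f)

# Hypothetical consumer: repeat a bundle-adjustment refinement step
# cfg["BA_iters"] times (1 after this commit).
for ba_iter in range(cfg.get("BA_iters", 1)):
    print(f"Running bundle adjustment pass {ba_iter + 1}")
    # run_bundle_adjustment(...)  # placeholder for the real BA call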
vggsfm_code/vggsfm/utils/triangulation_helpers.py CHANGED

@@ -14,7 +14,7 @@ import pycolmap
 
 from torch.cuda.amp import autocast
 from itertools import combinations
-
+import math
 
 def triangulate_multi_view_point_batched(
     cams_from_world, points, mask=None, compute_tri_angle=False, check_cheirality=False
@@ -44,14 +44,38 @@ def triangulate_multi_view_point_batched(
 
     A = torch.einsum("bnij,bnik->bjk", terms, terms)
 
+
+
     # Compute eigenvalues and eigenvectors
-
+    num_A_batch = len(A)
+    MAX_CUSOLVER_STATUS_INVALID_VALUE = 1024000
+    if num_A_batch>MAX_CUSOLVER_STATUS_INVALID_VALUE:
+        print("A too big matrix for torch.linalg.eigh(); Meet CUSOLVER_STATUS_INVALID_VALUE; Make it happy now")
+        num_runs = math.ceil(num_A_batch/MAX_CUSOLVER_STATUS_INVALID_VALUE)
+        eigenvectors_list = []
+        for run_idx in range(num_runs):
+            start_idx = run_idx * MAX_CUSOLVER_STATUS_INVALID_VALUE
+            end_idx = (run_idx+1) * MAX_CUSOLVER_STATUS_INVALID_VALUE
+            _, eigenvectors = torch.linalg.eigh(A[start_idx:end_idx])
+            eigenvectors_list.append(eigenvectors)
+        eigenvectors = torch.cat(eigenvectors_list)
+    else:
         _, eigenvectors = torch.linalg.eigh(A)
-
-
-
-
-
+
+
+    # try:
+    #     _, eigenvectors = torch.linalg.eigh(A)
+    # except:
+    #     # _, eigenvectors = torch.linalg.eigh(A[len(A)//10:len(A)//3])
+    #     # for idx in
+    #     torch.linalg.eigh(A[:len(A)//3])
+    #     print("Meet CUSOLVER_STATUS_INVALID_VALUE ERROR during torch.linalg.eigh()")
+    #     print("SWITCH TO torch.linalg.eig()")
+    #     import pdb;pdb.set_trace()
+    #     _, eigenvectors = torch.linalg.eig(A)
+    #     eigenvectors = torch.real(eigenvectors)
+
+
 
     # Select the first eigenvector
     first_eigenvector = eigenvectors[:, :, 0]
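The core of this commit is the chunked torch.linalg.eigh call: as the added messages indicate, very large batches of small normal matrices can trigger CUSOLVER_STATUS_INVALID_VALUE inside cuSOLVER, so the batch is split into slices of at most 1,024,000 matrices and the per-slice eigenvectors are concatenated. Below is a standalone sketch of the same pattern, isolated from VGGSfM; the names (MAX_EIGH_BATCH, batched_eigh_eigenvectors) are illustrative, not the repo's API:

import math
import torch

# Upper bound on the batch size handed to torch.linalg.eigh in one call;
# the commit uses 1_024_000 to stay below cuSOLVER's failure regime.
MAX_EIGH_BATCH = 1_024_000

def batched_eigh_eigenvectors(A: torch.Tensor) -> torch.Tensor:
    """Eigenvectors of a batch of symmetric matrices of shape (B, N, N),
    splitting the batch so each torch.linalg.eigh call stays small enough."""
    num_batch = len(A)
    if num_batch <= MAX_EIGH_BATCH:
        _, eigenvectors = torch.linalg.eigh(A)
        return eigenvectors
    chunks = []
    for run_idx in range(math.ceil(num_batch / MAX_EIGH_BATCH)):
        start = run_idx * MAX_EIGH_BATCH
        end = (run_idx + 1) * MAX_EIGH_BATCH
        _, eigenvectors = torch.linalg.eigh(A[start:end])
        chunks.append(eigenvectors)
    return torch.cat(chunks)

# Example: take the eigenvector of the smallest eigenvalue of each matrix,
# mirroring the "first eigenvector" selection in the triangulation helper.
A = torch.randn(8, 4, 4)
A = A @ A.transpose(-1, -2)  # make each matrix symmetric positive semi-definite
first_eigenvector = batched_eigh_eigenvectors(A)[:, :, 0]

Because eigh decomposes each matrix in the batch independently, slicing along the batch dimension and concatenating yields the same eigenvectors as one large call, while keeping each cuSOLVER launch below the problematic size.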