
      Meta: DINOv3 performs self-supervised learning for vision at unprecedented scale




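      The original post from Meta's website follows. Enjoy.
      My one reaction after reading it (Teacher Tan here): the performance is genuinely impressive, but the Apache license has been replaced with a commercial license. In other words, the Apache license that allowed free use, modification, and even commercial use has been swapped for a commercial license that requires payment or imposes more restrictions, so anyone who wants to keep using the model has to play by the new rules.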



      Open Source

      DINOv3: Self-supervised learning for vision at unprecedented scale

      August 14, 2025

      Takeaways:

      • We’re introducing DINOv3, which scales self-supervised learning for images to create universal vision backbones that achieve absolute state-of-the-art performance across diverse domains, including web and satellite imagery.
      • DINOv3 backbones produce powerful, high-resolution image features that make it easy to train lightweight adapters. This leads to exceptional performance on a broad array of downstream vision tasks, including image classification, semantic segmentation, and object tracking in video.
      • We’ve incorporated valuable community feedback, enhancing the versatility of DINOv3 by shipping smaller models that outperform comparable CLIP-based derivatives across a broad evaluation suite, as well as alternative ConvNeXt architectures for resource-constrained use cases.
      • We’re releasing the DINOv3 training code and pre-trained backbones under a commercial license to help drive innovation and advancements in the computer vision and multimodal ecosystem.

      Self-supervised learning (SSL), the concept that AI models can learn independently without human supervision, has emerged as the dominant paradigm in modern machine learning. It has driven the rise of large language models that acquire universal representations by pre-training on massive text corpora. However, progress in computer vision has lagged behind, as the most powerful image encoding models still rely heavily on human-generated metadata, such as web captions, for training.

      Today, we’re releasing DINOv3, a generalist, state-of-the-art computer vision model trained with SSL that produces superior high-resolution visual features. For the first time, a single frozen vision backbone outperforms specialized solutions on multiple long-standing dense prediction tasks including object detection and semantic segmentation.



      DINOv3’s breakthrough performance is driven by innovative SSL techniques that eliminate the need for labeled data—drastically reducing the time and resources required for training and enabling us to scale training data to 1.7B images and model size to 7B parameters. This label-free approach enables applications where annotations are scarce, costly, or impossible.

      For example, our research shows that DINOv3 backbones pre-trained on satellite imagery achieve exceptional performance on downstream tasks such as canopy height estimation.

      We believe DINOv3 will help accelerate existing use cases and also unlock new ones, leading to advancements in industries such as healthcare, environmental monitoring, autonomous vehicles, retail, and manufacturing—enabling more accurate and efficient visual understanding at scale.

      We’re releasing DINOv3 with a comprehensive suite of open sourced backbones under a commercial license, including a satellite backbone trained on MAXAR imagery. We’re also sharing a subset of our downstream evaluation heads, enabling the community to reproduce our results and build upon them. Additionally, we’re providing sample notebooks so the community has detailed documentation to help them start building with DINOv3 today.

      Unlocking high-impact applications with self-supervised learning

      DINOv3 achieves a new milestone by demonstrating, for the first time, that SSL models can outperform their weakly supervised counterparts across a wide range of tasks.

      While previous DINO models set a significant lead in dense prediction tasks, such as segmentation and monocular depth estimation, DINOv3 surpasses these accomplishments.

      Our models match or exceed the performance of the strongest recent models such as SigLIP 2 and Perception Encoder on many image classification benchmarks, and at the same time, they drastically widen the performance gap for dense prediction tasks.





      DINOv3 builds on the breakthrough DINO algorithm, requiring no metadata input, consuming only a fraction of the training compute compared to prior methods, and still delivering exceptionally strong vision foundation models.

      The novel refinements introduced in DINOv3 lead to state-of-the-art performance on competitive downstream tasks such as object detection under the severe constraint of frozen weights. This eliminates the need for researchers and developers to fine-tune the model for specific tasks, enabling broader and more efficient application.



      Finally, because the DINO approach is not specifically tailored to any image modality, the same algorithm can be applied beyond web imagery to other domains where labeling is prohibitively difficult or expensive. DINOv2 already leverages vast amounts of unlabeled data to support diagnostic and research efforts in histology, endoscopy, and medical imaging. In satellite and aerial imagery, the overwhelming volume and complexity of data make manual labeling impractical.

      With DINOv3, we make it possible for these rich datasets to be used to train a single backbone that can then be used across satellite types, enabling general applications in environmental monitoring, urban planning, and disaster response.

      DINOv3 is already having real-world impact.

      The World Resources Institute (WRI) is using our latest model to monitor deforestation and support restoration, helping local groups protect vulnerable ecosystems. WRI uses DINOv3 to analyze satellite images and detect tree loss and land-use changes in affected ecosystems. The accuracy gains from DINOv3 support automating climate finance payments by verifying restoration outcomes, reducing transaction costs, and accelerating funding to small, local groups.

      For example, compared to DINOv2, DINOv3 trained on satellite and aerial imagery reduces the average error in measuring tree canopy height in a region of Kenya from 4.1 meters to 1.2 meters. WRI is now able to scale support for thousands of farmers and conservation projects more efficiently.



      Scalable and efficient visual modeling without fine-tuning

      We built DINOv3 by training a 7x larger model on a 12x larger dataset than its predecessor, DINOv2. To showcase the model’s versatility, we evaluate it across 15 diverse visual tasks and more than 60 benchmarks. The DINOv3 backbone particularly shines on all dense prediction tasks, showing an exceptional understanding of the scene layout and underlying physics.
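As a rough sanity check, the multipliers above can be combined with the totals quoted earlier in this post (1.7B images, 7B parameters) to back out the approximate scale of the predecessor. This is a minimal sketch: the variable names are illustrative, and because "7x" and "12x" are rounded in the post, the results are approximations rather than official DINOv2 figures.

```python
# Totals quoted earlier in the post for DINOv3.
dinov3_params = 7e9      # 7B parameters
dinov3_images = 1.7e9    # 1.7B training images

# Multipliers quoted in this section (rounded in the source).
model_scale = 7
data_scale = 12

# Implied predecessor scale; approximate because of that rounding.
implied_params = dinov3_params / model_scale
implied_images = dinov3_images / data_scale

summary = f"~{implied_params / 1e9:.1f}B parameters, ~{implied_images / 1e6:.0f}M images"
print(summary)  # ~1.0B parameters, ~142M images
```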

      The rich, dense features capture measurable attributes or characteristics of each pixel in an image and are represented as vectors of floating-point numbers. These features are capable of parsing objects into finer parts, even generalizing across instances and categories. This dense representation power makes it easy to train lightweight adapters with minimal annotations on top of DINOv3, meaning a few annotations and a linear model are sufficient to obtain robust dense predictions.
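The idea of "a few annotations and a linear model" on top of frozen features can be sketched as a linear probe. The example below is an assumption-laden toy, not DINOv3 code: the 2-D "patch features" are synthetic stand-ins generated by a hypothetical `make_patches` helper, and the probe is plain softmax regression trained on only five labeled patches per class. The point is only that the feature vectors are never updated; all learning happens in the small linear layer.

```python
import math
import random

random.seed(0)  # deterministic toy data

def make_patches(n_per_class, noise=0.1):
    """Synthetic stand-ins for frozen per-patch features (NOT real DINOv3 outputs)."""
    centers = {0: (1.0, 0.0), 1: (0.0, 1.0)}  # hypothetical class 0 / class 1
    features, labels = [], []
    for label, (cx, cy) in centers.items():
        for _ in range(n_per_class):
            features.append([cx + random.gauss(0, noise), cy + random.gauss(0, noise)])
            labels.append(label)
    return features, labels

def logits_for(weights, biases, x):
    """One linear score per class for a single feature vector."""
    return [sum(w_i * x_i for w_i, x_i in zip(w, x)) + b for w, b in zip(weights, biases)]

def train_linear_probe(features, labels, num_classes, epochs=100, lr=0.1):
    """Softmax regression trained with plain SGD; the features are never updated."""
    dim = len(features[0])
    weights = [[0.0] * dim for _ in range(num_classes)]
    biases = [0.0] * num_classes
    for _ in range(epochs):
        for x, y in zip(features, labels):
            logits = logits_for(weights, biases, x)
            peak = max(logits)  # subtract the max for numerical stability
            exps = [math.exp(v - peak) for v in logits]
            total = sum(exps)
            for c in range(num_classes):
                grad = exps[c] / total - (1.0 if c == y else 0.0)
                for i in range(dim):
                    weights[c][i] -= lr * grad * x[i]
                biases[c] -= lr * grad
    return weights, biases

def predict(weights, biases, x):
    """Class index with the highest linear score."""
    logits = logits_for(weights, biases, x)
    return max(range(len(logits)), key=logits.__getitem__)

# Train on 5 labeled patches per class, evaluate on 50 held-out patches per class.
train_x, train_y = make_patches(n_per_class=5)
test_x, test_y = make_patches(n_per_class=50)
weights, biases = train_linear_probe(train_x, train_y, num_classes=2)
accuracy = sum(predict(weights, biases, x) == y for x, y in zip(test_x, test_y)) / len(test_y)
print(f"held-out patch accuracy: {accuracy:.2f}")
```

With real dense features the probe would take one vector per image patch and emit one label per patch, which is what makes this setup usable for segmentation-style tasks.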

      Pushing things further and using a more sophisticated decoder, we show that it’s possible to achieve state-of-the-art performance on long-standing core computer vision tasks without fine-tuning the backbone.

      We show such results on object detection, semantic segmentation, and relative depth estimation.

      Because state-of-the-art results can be achieved without fine-tuning the backbone, a single forward pass can serve multiple applications simultaneously.

      This enables the inference cost of the backbone to be shared across tasks, which is especially critical for edge applications that often require running many predictions at once.

      DINOv3’s versatility and efficiency make it the perfect candidate for such deployment scenarios, as demonstrated by NASA’s Jet Propulsion Laboratory (JPL), which is already using DINOv2 to build exploration robots for Mars, enabling multiple vision tasks with minimal compute.
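The cost-sharing argument above can be illustrated with a small sketch: features are computed once by a frozen backbone and then reused by several lightweight heads. Everything here is hypothetical, including the `FrozenBackbone` class, the toy row-wise "patch features", and the three heads; none of it is the DINOv3 API. The counter simply makes it visible that the backbone runs a single time however many heads consume its output.

```python
class FrozenBackbone:
    """Hypothetical stand-in for a frozen feature extractor; counts forward passes."""

    def __init__(self):
        self.forward_passes = 0

    def __call__(self, image):
        self.forward_passes += 1
        # Toy "patch features": [mean intensity, max intensity] for each image row.
        return [[sum(row) / len(row), max(row)] for row in image]

def classification_head(features):
    """Image-level label from the average of the per-patch mean intensities."""
    mean_intensity = sum(f[0] for f in features) / len(features)
    return "bright" if mean_intensity > 0.5 else "dark"

def segmentation_head(features):
    """Per-patch foreground mask computed from the same features."""
    return [f[0] > 0.5 for f in features]

def peak_head(features):
    """Index of the first patch with the highest max intensity."""
    return max(range(len(features)), key=lambda i: features[i][1])

backbone = FrozenBackbone()
image = [[0.1, 0.2], [0.8, 0.9], [0.7, 0.9]]  # toy image: 3 rows of 2 pixels

features = backbone(image)  # the only backbone call
heads = {
    "classification": classification_head,
    "segmentation": segmentation_head,
    "peak": peak_head,
}
outputs = {name: head(features) for name, head in heads.items()}
print(outputs, "backbone passes:", backbone.forward_passes)
```

Adding a fourth task means adding another entry to `heads`; the expensive call stays at one per image, which is the property that matters on edge devices.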

      A family of deployment-friendly models

      Scaling DINOv3 to 7B parameters shows SSL’s full potential. However, a 7B model is impractical for many downstream applications. Following feedback from the community, we built a family of models spanning a large range of inference compute requirements to empower researchers and developers across diverse use cases.

      By distilling the ViT-7B model into smaller, high-performing variants like ViT-B and ViT-L, DINOv3 outperforms comparable CLIP-based models across a broad evaluation suite.

      Additionally, we introduce alternative ConvNeXt architectures (T, S, B, L) distilled from ViT-7B, that can accommodate varying compute constraints. We’re also releasing our distillation pipeline to enable the community to build upon this foundation.
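This post does not describe the distillation objective, so the following is only a generic knowledge-distillation sketch, not the DINOv3 pipeline. It assumes a classic setup in which a small student is trained to match a large teacher's softened output distribution, measured by KL divergence. The function names, the temperature value, and the example logits are all illustrative.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax, with the max subtracted for numerical stability."""
    scaled = [v / temperature for v in logits]
    peak = max(scaled)
    exps = [math.exp(v - peak) for v in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) between the softened output distributions."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    return sum(p * math.log(p / q) for p, q in zip(p_teacher, p_student) if p > 0)

teacher_logits = [2.0, 0.5, -1.0]   # hypothetical large-model outputs
near_student = [1.8, 0.6, -0.9]     # student that roughly agrees with the teacher
far_student = [-1.0, 0.5, 2.0]      # student that ranks the classes in reverse

loss_same = distillation_loss(teacher_logits, teacher_logits)
loss_near = distillation_loss(near_student, teacher_logits)
loss_far = distillation_loss(far_student, teacher_logits)
print(loss_same, loss_near, loss_far)
```

The loss is zero when the student reproduces the teacher exactly and grows as the two distributions diverge, so minimizing it pulls the smaller model toward the larger model's behavior.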





      Disclaimer: original work by the author, for reference only.

      Posted by 親愛的數據, author of the book 《我看見了風暴:人工智能基建革命》 (I Saw the Storm: The AI Infrastructure Revolution)