casscale.blogg.se

Google photos duplicate detection






Using the public API is fairly straightforward; the most difficult part is probably just setting something up to authenticate with my own credentials and access my photos. In the Google Cloud console, I created a project and enabled the Photos Library API for it, then created an OAuth client ID that I could use in my application.
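To make the authentication and listing steps concrete, here is a minimal sketch of how such a client might look. It is not the code from my application: the file name `client_secret.json` and the function names are placeholders, it assumes the `google-auth-oauthlib` package is installed for the browser-based login, and it assumes the read-only Library API scope is available to your OAuth client (Google has changed what these scopes allow over time, so check the current documentation).

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://photoslibrary.googleapis.com/v1/mediaItems"
SCOPES = ["https://www.googleapis.com/auth/photoslibrary.readonly"]


def get_access_token(client_secrets_path="client_secret.json"):
    """Run the installed-app OAuth flow in a browser and return an access token.

    The path is a placeholder for the OAuth client ID file downloaded
    from the Google Cloud console.
    """
    # Imported lazily so the rest of the module works without the package.
    from google_auth_oauthlib.flow import InstalledAppFlow

    flow = InstalledAppFlow.from_client_secrets_file(client_secrets_path, SCOPES)
    credentials = flow.run_local_server(port=0)
    return credentials.token


def fetch_page(token, page_token=None):
    """Fetch one page of media items and return the decoded JSON dict."""
    params = {"pageSize": 100}
    if page_token:
        params["pageToken"] = page_token
    request = urllib.request.Request(
        API_URL + "?" + urllib.parse.urlencode(params),
        headers={"Authorization": "Bearer " + token},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def iter_media_items(token, fetch=fetch_page):
    """Yield every media item, following nextPageToken until it runs out."""
    page_token = None
    while True:
        page = fetch(token, page_token)
        yield from page.get("mediaItems", [])
        page_token = page.get("nextPageToken")
        if not page_token:
            break
```

Something like `for item in iter_media_items(get_access_token()): print(item["filename"])` would then walk the whole library. The `fetch` parameter exists only so the paging logic can be exercised without touching the network.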


Doing a web search for “Google Photos remove duplicates” mostly gives back content farm-style articles that talk about how Photos provides some duplicate-finding features (yes, but they’re limited) and usually go on to suggest some nonfree application for finding duplicates on your local computer. None of that helps me: I don’t have duplicates on my local computer, and want to clean up the duplicates in the Google library so it’s easier to browse (and to save some space). The official duplicate finding tool doesn’t appear to look at visual similarity, so it misses the thumbnails I want to remove. They’re trying to sell me something that doesn’t do what I want. I did discover Rémi Mikel’s duplicate finding tool, which seemed like a step in the right direction but performed too badly to be very useful (probably because there are thousands of images that I want to get rid of, and it tries to display all of them in your browser). It did suggest an approach I could take to do this myself, though: using the public Google Photos API. With some ideas in mind, I wrote a little Python application to gather data.


I had previously ignored the duplicates in alternate formats because they weren’t too annoying, but when the sync tool uploaded a few thousand duplicate thumbnails I felt the need to do something about it. There aren’t many existing options for managing duplicate items like this.


While Google’s sync tool does know how to avoid uploading exact duplicates of photos, it doesn’t do any similarity matching on image content, so thumbnails (with the same content but lower resolution) and alternate formats (sidecar JPEGs that go along with camera raw files) end up duplicated in the Google Photos library.
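To illustrate what similarity matching on image content could mean here, the sketch below uses an average hash, one common perceptual hashing technique. This is an assumption for illustration, not something the sync tool or my own application is known to do. It works on grayscale images given as lists of pixel rows (decoding real image files is left out), and all function names are made up for the example.

```python
def shrink(pixels, size=8):
    """Downscale a grayscale image (list of rows) to size x size by block averaging.

    Assumes the image is at least size pixels wide and tall.
    """
    height, width = len(pixels), len(pixels[0])
    small = []
    for row in range(size):
        top, bottom = row * height // size, (row + 1) * height // size
        out_row = []
        for col in range(size):
            left, right = col * width // size, (col + 1) * width // size
            block = [pixels[y][x] for y in range(top, bottom) for x in range(left, right)]
            out_row.append(sum(block) / len(block))
        small.append(out_row)
    return small


def average_hash(pixels, size=8):
    """Return an integer whose bits mark which cells are brighter than the mean."""
    flat = [value for row in shrink(pixels, size) for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (value > mean)
    return bits


def hamming_distance(a, b):
    """Count the bits that differ between two hashes."""
    return bin(a ^ b).count("1")
```

Because the image is reduced to a tiny grid before hashing, a thumbnail and its full-resolution original should land on the same or nearly the same hash, while an exact-match check on file contents would treat them as unrelated. How small a Hamming distance counts as "the same photo" is a threshold that would need tuning against real data.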


Managing Google Photos duplicates with Python
30 April, 2022

I recently had a bit of a problem with the files that had ended up in Google Photos on my account: the Google Drive desktop synchronization app seemed to have noticed the many (reasonably high-resolution) thumbnails that my local photo management application (Lightroom) creates, and had uploaded many near-duplicate images. It seems this wasn’t a problem with the old “Backup and Sync” application because it supported excluding some files from backup (so I could have it ignore the directory that thumbnails get put into), but the new Drive application lacks such a feature.







