Google Photos: Get and upload photos with the API
As an Android phone owner, I now take more photos with my phone camera than with my 10-year-old DSLR. Massive amounts of my pictures and videos are stored in Google's cloud.
While Google does an excellent job of making these photos accessible and searchable through the web interface, I also want all my photo details as a data file I can process myself.
To get this data, the first thing to try is the Google Photos API. I will use Python in a Jupyter Notebook running locally on my computer.
Google API OAuth authentication
The most difficult part for me was figuring out how to make Google API authentication work. Compared to other methods such as an API key and secret or a password, OAuth is more complex: it requires a number of requests between client and server, and it asks the end user to allow the application to access specific data from their Google account.
The first step to get this done is to create a new project using Google Cloud Console.
The next thing you need to set up is the “OAuth consent screen”. Google applies more rigorous checks to apps intended for external use, so choosing the user type “Internal” lets you create a consent screen with a minimal amount of required data.
Next, navigate to “Credentials” and click on “Create credentials”
Select “OAuth client ID”
In the Application type dropdown select “Desktop app”:
Enter a name for your new client ID:
A new client ID is created. Download the JSON file with the newly created credentials.
Authenticating with Google API
Use the credentials to obtain a temporary access token, which will be used to make calls to the Google Photos API.
The following imports are required:
import os
import pandas as pd
from google.oauth2.credentials import Credentials
from google_auth_oauthlib.flow import InstalledAppFlow
from google.auth.transport.requests import Request
The scopes array lists the actions you allow the application to perform with your data. In the following code, we request read-only access to the photo library.
The code checks whether token.json already exists on your local drive. If it is not found or is no longer valid, a new token is requested from Google. While doing this, a Google authorisation page will appear, asking whether you want to grant the requested permissions to the application.
The client ID file downloaded in the previous step should be saved as _secrets_/client_secret.json.
scopes = ['https://www.googleapis.com/auth/photoslibrary.readonly']

creds = None
if os.path.exists('_secrets_/token.json'):
    creds = Credentials.from_authorized_user_file('_secrets_/token.json', scopes)
if not creds or not creds.valid:
    if creds and creds.expired and creds.refresh_token:
        # Refresh the expired token without user interaction
        creds.refresh(Request())
    else:
        # Run the interactive OAuth flow in the browser
        flow = InstalledAppFlow.from_client_secrets_file(
            '_secrets_/client_secret.json', scopes)
        creds = flow.run_local_server()
    # Save the credentials for the next run
    with open('_secrets_/token.json', 'w') as token:
        token.write(creds.to_json())
With credentials loaded into the creds variable, you can create an AuthorizedSession, which adds your authorisation token to the HTTP headers of each API request.
authed_session uses the Requests HTTP library under the hood, so check its documentation for the get() and post() method parameters.
from google.auth.transport.requests import AuthorizedSession
authed_session = AuthorizedSession(creds)
Making a Google Photos API “mediaItems:search” call
Now we can call the API to search for photos taken in a given date range, for example between 1 and 26 January 2023. Check the mediaItems:search documentation for the other search filters available.
The maximum page size is 100, so to get all photos you need to make additional requests: the nextPageToken from each API response should be passed as the pageToken parameter to get the next page.
In the following code, all photo search results are collected in the media_items list.
nextPageToken = None
idx = 0
media_items = []
while True:
    idx += 1
    print(idx)
    response = authed_session.post(
        'https://photoslibrary.googleapis.com/v1/mediaItems:search',
        headers={'content-type': 'application/json'},
        json={
            "pageSize": 100,
            "pageToken": nextPageToken,
            "filters": {
                "dateFilter": {
                    "ranges": [{
                        "startDate": {"year": 2023, "month": 1, "day": 1},
                        "endDate": {"year": 2023, "month": 1, "day": 26}
                    }]
                }
            }
        })
    response_json = response.json()
    # A page may contain no items, so fall back to an empty list
    media_items += response_json.get("mediaItems", [])
    if "nextPageToken" not in response_json:
        break
    nextPageToken = response_json["nextPageToken"]
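The same pagination pattern can be factored into a small generator that keeps requesting pages until no nextPageToken is returned. This is a sketch of the loop above, not part of the API: fetch_page is a stand-in for the authed_session.post call, which lets the pagination logic be exercised with a stub.

```python
def paginate(fetch_page):
    """Yield media items from every page. fetch_page(page_token) must
    return a dict shaped like a mediaItems:search response."""
    page_token = None
    while True:
        response_json = fetch_page(page_token)
        yield from response_json.get("mediaItems", [])
        page_token = response_json.get("nextPageToken")
        if page_token is None:
            break

# Stub standing in for the real API call, returning two pages
def fake_fetch(page_token):
    if page_token is None:
        return {"mediaItems": [{"id": "a"}, {"id": "b"}], "nextPageToken": "p2"}
    return {"mediaItems": [{"id": "c"}]}

all_items = list(paginate(fake_fetch))  # → [{'id': 'a'}, {'id': 'b'}, {'id': 'c'}]
```

In the real notebook, fetch_page would wrap the authed_session.post call shown above.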
Convert results into a pandas DataFrame
Now the media_items list can be converted into a pandas DataFrame. It contains information about the 1985 photos that I took during that period.
I have also extracted the information stored as JSON in the mediaMetadata column using the json_normalize method and concatenated both tables into a single one.
The following data is available:
- Image size
- Camera model from the image EXIF data
- Exposure, f-number and lens focal length
- Time when the photo was taken
photos_df = pd.DataFrame(media_items)
photos_df = pd.concat(
    [photos_df,
     pd.json_normalize(photos_df.mediaMetadata).rename(columns={"creationTime": "creationTime_metadata"})],
    axis=1)
photos_df["creationTime_metadata_dt"] = pd.to_datetime(photos_df.creationTime_metadata, utc=True)
photos_df
Here is an example of the details for a single photo.
photos_df.iloc[25]
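To illustrate what json_normalize does here, a minimal sketch with a hand-written sample payload — the field names mirror the mediaMetadata structure returned by the API, but the values are made up:

```python
import pandas as pd

# Illustrative sample shaped like the API's mediaItems entries (values invented)
media_items = [
    {"id": "1", "mediaMetadata": {"creationTime": "2023-01-02T10:00:00Z",
                                  "photo": {"cameraModel": "Pixel 6"}}},
    {"id": "2", "mediaMetadata": {"creationTime": "2023-01-03T11:30:00Z",
                                  "photo": {"cameraModel": "Pixel 6"}}},
    {"id": "3", "mediaMetadata": {"creationTime": "2023-01-05T09:15:00Z",
                                  "photo": {"cameraModel": "NIKON D90"}}},
]

# Nested keys become dot-separated columns, e.g. "photo.cameraModel"
meta_df = pd.json_normalize([item["mediaMetadata"] for item in media_items])
camera_counts = meta_df["photo.cameraModel"].value_counts()
print(camera_counts)
```

On the real data, the same value_counts call shows which camera took most of your photos.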
Unfortunately, I did not find the data I was looking for: I wanted the GPS coordinates of where my photos were taken. It looks like, due to data privacy concerns, Google decided not to make this data available through the API. That is a pity, but I will show you how to get this data using Google Takeout in my next post.
Using data
baseUrl can be used to get the actual photo content. By appending =w{width}-h{height} parameters to the end of the base URL and making an HTTP GET request, you receive the photo contents as a byte array:
from IPython import display
image_data_response = authed_session.get(photos_df.baseUrl[25] + "=w500-h250")
display.Image(data=image_data_response.content)
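The size suffix is just a string appended to the base URL, so a small helper makes it explicit — sized_url is my own convenience function, not part of the API:

```python
def sized_url(base_url: str, width: int, height: int) -> str:
    """Append Google Photos' =w{width}-h{height} size suffix to a base URL."""
    return f"{base_url}=w{width}-h{height}"

print(sized_url("https://lh3.googleusercontent.com/abc", 500, 250))
# → https://lh3.googleusercontent.com/abc=w500-h250
```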
With the photo metadata available, you can easily build some data visualisations based on the date and time photos were taken.
In which hours most photos were taken
By extracting the hour when each photo was taken, you can build a histogram showing the hours when you were most actively taking photos. Please note that the time is in the UTC time zone.
photos_df["creationTime_metadata_hour"] = photos_df.creationTime_metadata_dt.dt.hour
# reindex over all 24 hours so hours with no photos still appear in the chart
photos_df.creationTime_metadata_hour.value_counts().to_frame().reindex(range(0, 24)).fillna(0).plot.bar(
    title="Photos taken per hour (UTC)", figsize=(10, 7), legend=False, width=.9)
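Since the timestamps are in UTC, you may want to shift them to your local time zone before plotting. A minimal sketch using pandas time-zone handling on sample data (the zone name here is just an example):

```python
import pandas as pd

# Sample UTC creation times, shaped like the parsed creationTime_metadata_dt column
times_utc = pd.to_datetime(pd.Series(["2023-01-02T23:30:00Z", "2023-01-03T10:00:00Z"]))

# Convert to a local zone, e.g. Europe/Vilnius (UTC+2 in January)
times_local = times_utc.dt.tz_convert("Europe/Vilnius")
print(times_local.dt.hour.tolist())  # → [1, 12]
```

Applied to photos_df.creationTime_metadata_dt, the same tz_convert call would make the hour histogram reflect local time.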
Number of photos taken by date
The following code shows the number of photos taken per date:
import matplotlib.pyplot as plt

photos_df["creationTime_metadata_date"] = photos_df.creationTime_metadata_dt.dt.date
# reindex over the full date range so days without photos show as zero
date_index = pd.date_range(photos_df.creationTime_metadata_date.min(),
                           photos_df.creationTime_metadata_date.max())
photos_per_day = photos_df.creationTime_metadata_date.value_counts().reindex(date_index.date).fillna(0)
plt.figure(figsize=(10, 6))
ax = plt.bar(photos_per_day.index, photos_per_day.values, width=.9)
Uploading a photo
Since a read-only scope was used earlier to access the Google Photos API, we need to change the scope to https://www.googleapis.com/auth/photoslibrary.appendonly and get a new authorisation token in order to be able to append content to Google Photos.
The new token will be saved as token_append.json.
scopes = ['https://www.googleapis.com/auth/photoslibrary.appendonly']

creds = None
if os.path.exists('_secrets_/token_append.json'):
    creds = Credentials.from_authorized_user_file('_secrets_/token_append.json', scopes)
if not creds or not creds.valid:
    if creds and creds.expired and creds.refresh_token:
        # Refresh the expired token without user interaction
        creds.refresh(Request())
    else:
        # Run the interactive OAuth flow in the browser
        flow = InstalledAppFlow.from_client_secrets_file(
            '_secrets_/client_secret.json', scopes)
        creds = flow.run_local_server()
    # Save the credentials for the next run
    with open('_secrets_/token_append.json', 'w') as token:
        token.write(creds.to_json())
from google.auth.transport.requests import AuthorizedSession
authed_session = AuthorizedSession(creds)
Now read the JPEG image file from the local drive into image_contents as a byte array and upload it with an HTTP POST request. If the upload is successful, the response text contains an upload token.
# read image from file
with open("data/2023-01-19_Temple_of_Apollo,_Side,_Turkey_1.jpeg", "rb") as f:
    image_contents = f.read()

# upload photo bytes and get an upload token
response = authed_session.post(
    "https://photoslibrary.googleapis.com/v1/uploads",
    headers={
        "Content-type": "application/octet-stream",
        "X-Goog-Upload-Content-Type": "image/jpeg",
        "X-Goog-Upload-Protocol": "raw",
    },
    data=image_contents)
upload_token = response.text
The next step is to add the image to Google Photos using “mediaItems:batchCreate”. As the endpoint name suggests, multiple images can be added in a single request.
# use batchCreate to add the photo with a description
response = authed_session.post(
    'https://photoslibrary.googleapis.com/v1/mediaItems:batchCreate',
    headers={'content-type': 'application/json'},
    json={
        "newMediaItems": [{
            "description": "Test photo",
            "simpleMediaItem": {
                "uploadToken": upload_token,
                "fileName": "test.jpg"
            }
        }]
    })
print(response.text)
If successful, the response contains the new image ID and other details. The image will also appear in your Google Photos. Note that the project name used to create the client ID appears in the photo's “Uploaded from …” field.
There is a limit on the number of requests you can make per day, but in most cases it should be sufficient for personal use.
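The response is JSON, so the new item's ID and status can be pulled out programmatically. A minimal sketch using a hand-written sample dict shaped like a batchCreate response (the values are made up):

```python
# Illustrative sample shaped like a mediaItems:batchCreate response
response_json = {
    "newMediaItemResults": [{
        "uploadToken": "TOKEN...",
        "status": {"message": "Success"},
        "mediaItem": {"id": "AAA123", "filename": "test.jpg"}
    }]
}

# Each uploaded item gets its own result entry with a status and, on
# success, the created mediaItem
for result in response_json["newMediaItemResults"]:
    status = result["status"].get("message", "")
    media_id = result.get("mediaItem", {}).get("id")
    print(status, media_id)  # → Success AAA123
```

With the real response, replace the sample dict with response.json().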
Conclusion
While this API can be useful for some use cases, it has limited capabilities compared to the details available in the Google Photos UI. The absence of GPS data is a major limitation for me.
On the other hand, it is useful if you want to automate image uploads, search for photos or manage albums.
The same authentication flow can also be used to access other Google APIs, such as Calendar and Google Drive.
In my next post, I describe how to get the complete EXIF metadata and original photos using Google Takeout.