Extract user profile attributes from an Azure AD B2C tenant using the Microsoft Graph API


I had to retrieve a list of users from an Azure Active Directory B2C tenant today. I thought I could just go through the Azure portal, but its user list is limited to short pages of data and a small set of attributes.

The portal does provide a CSV export, but the output doesn't include the identity objects, so it's no use if you need a user's sign-in email address.

I had to use the Microsoft Graph API to get what I needed. This is a bit hacky but it does the trick!

Authenticating with the Graph API

The Microsoft Graph API is the recommended way to work with users in the Office 365 suite of products, and Azure AD B2C uses the same API. To work with users efficiently you have to use the Graph API.

The Graph API needs an access token for an application in your tenant that has permission to work with users. If you're trying to retrieve a list of users then you probably already have an application. You'll need to set a few things up for that application in the Azure portal so the script can authenticate.

You need to give the application a few permissions in Microsoft Graph: User.ReadWrite.All, openid and offline_access. (If you only need to read users, User.Read.All is enough.) Next, create a client secret for the application in the "Certificates and secrets" section. The tenant ID and client ID both come from the application's overview page.
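To make the token request a bit more concrete, here is a minimal sketch of how the token URL and form body for the client credentials flow could be built. The helper names `buildTokenUrl` and `buildTokenRequestBody` are my own and purely illustrative, and the IDs and secret in the usage lines are made-up placeholders. Every value is passed through `encodeURIComponent` so that a secret containing characters like `+` or `&` survives the trip.

```typescript
// Hypothetical helpers (not part of any SDK) that build the pieces of a
// client credentials token request for Microsoft Graph.

// The token endpoint for a given tenant.
function buildTokenUrl(tenantId: string): string {
  return `https://login.microsoftonline.com/${encodeURIComponent(
    tenantId
  )}/oauth2/v2.0/token`;
}

// The application/x-www-form-urlencoded body for the token request.
function buildTokenRequestBody(clientId: string, clientSecret: string): string {
  const fields: Array<[string, string]> = [
    ["client_id", clientId],
    ["scope", "https://graph.microsoft.com/.default"],
    ["client_secret", clientSecret],
    ["grant_type", "client_credentials"],
  ];
  // Encode each key and value so special characters cannot break the body.
  return fields
    .map(([key, value]) => `${encodeURIComponent(key)}=${encodeURIComponent(value)}`)
    .join("&");
}

// Placeholder values for illustration only.
console.log(buildTokenUrl("my-tenant-id"));
console.log(buildTokenRequestBody("my-client-id", "s+cret&more"));
```

You would POST the body to the URL and read `access_token` from the JSON response.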

Azure AD B2C user profile attributes in Graph API

The Azure AD B2C user profiles have attributes we can retrieve that are not in the default response object. We have to specify them in our query. In particular we want the identities. A single account on Azure AD B2C can have multiple methods of signing in. An identity represents a single method of logging in for an account.

The user sign-up flow creates an identity with the emailAddress sign-in type.

You can see the full specification, with all the other possible properties, in the Microsoft Graph documentation for the user resource type.
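To show what we're working with, here is a rough sketch of a user as returned when you select `id` and `identities`, plus a small lookup for the email sign-in identity. The sample user, the issuer domain and the helper name `findSignInEmail` are invented for illustration; only the property names `signInType`, `issuer` and `issuerAssignedId` come from the Graph identity object.

```typescript
// One entry in a user's `identities` array.
interface ObjectIdentity {
  signInType: string; // for example "emailAddress" or "federated"
  issuer: string;
  issuerAssignedId: string;
}

// Made-up sample user, shaped like a $select=id,identities result.
const sampleUser: { id: string; identities: ObjectIdentity[] } = {
  id: "00000000-0000-0000-0000-000000000000",
  identities: [
    {
      signInType: "userPrincipalName",
      issuer: "contoso.onmicrosoft.com",
      issuerAssignedId: "abc123@contoso.onmicrosoft.com",
    },
    {
      signInType: "emailAddress",
      issuer: "contoso.onmicrosoft.com",
      issuerAssignedId: "jane@example.com",
    },
  ],
};

// Hypothetical helper: returns the sign-in email, or undefined when the
// user has no identity of type "emailAddress".
function findSignInEmail(
  identities: ObjectIdentity[] | undefined
): string | undefined {
  return identities?.find((identity) => identity.signInType === "emailAddress")
    ?.issuerAssignedId;
}

console.log(findSignInEmail(sampleUser.identities));
```

Users who only sign in through a social provider may not have an emailAddress identity at all, which is why the helper can return undefined.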

Paging results in Azure AD B2C Graph API

By default the Graph API returns 100 users per response. We can set the $top query parameter to raise that to a maximum of 999 results. If you have more users than that you will have to page through them 999 at a time.
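As a small sketch of how the page size and the selected fields fit into the request URL, something like the following would work. The function name `buildUsersUrl` and the range check are my own additions for illustration, not anything the Graph API provides.

```typescript
// Largest page size the users endpoint accepts for $top.
const MAX_PAGE_SIZE = 999;

// Hypothetical helper: builds the users URL with $top and $select.
function buildUsersUrl(pageSize: number, fields: string[]): string {
  if (!Number.isInteger(pageSize) || pageSize < 1 || pageSize > MAX_PAGE_SIZE) {
    throw new RangeError(
      `$top must be an integer between 1 and ${MAX_PAGE_SIZE}`
    );
  }
  // Property names are joined with commas, as $select expects.
  const select = fields.map((field) => encodeURIComponent(field)).join(",");
  return `https://graph.microsoft.com/v1.0/users?$top=${pageSize}&$select=${select}`;
}

console.log(buildUsersUrl(999, ["id", "identities"]));
```

Selecting only the fields you need also keeps each page small, which helps when you are pulling thousands of users.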

To help with paging, the Graph API returns a cursor URL that points to the next set of data. This is in the @odata.nextLink property of the response. If the property is missing, you've reached the end of the results.
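The paging loop on its own can be sketched like this. To keep the example runnable without a tenant, the HTTP call is passed in as a function and replaced by a fake two-page "API"; the names `collectAllPages`, `PageFetcher`, `fakePages` and `fakeFetch` are all made up for this illustration.

```typescript
// The part of a Graph list response that matters for paging.
interface GraphPage<T> {
  value: T[];
  "@odata.nextLink"?: string;
}

// Anything that can turn a URL into a page of results.
type PageFetcher<T> = (url: string) => Promise<GraphPage<T>>;

// Follows @odata.nextLink until it is absent and returns every item.
async function collectAllPages<T>(
  firstUrl: string,
  fetchPage: PageFetcher<T>
): Promise<T[]> {
  const allItems: T[] = [];
  let nextUrl: string | undefined = firstUrl;
  while (nextUrl !== undefined) {
    const page: GraphPage<T> = await fetchPage(nextUrl);
    allItems.push(...page.value);
    // Undefined on the last page, which ends the loop.
    nextUrl = page["@odata.nextLink"];
  }
  return allItems;
}

// Fake two-page data source standing in for the real Graph API.
const fakePages: Record<string, GraphPage<string>> = {
  "page-1": { value: ["a", "b"], "@odata.nextLink": "page-2" },
  "page-2": { value: ["c"] },
};
const fakeFetch: PageFetcher<string> = async (url) => fakePages[url];

collectAllPages("page-1", fakeFetch).then((items) => console.log(items));
```

In a real script the fetcher would make an authenticated GET request and return the response body.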

The script

Once you have all your configuration data you can write your script. Fill in the configurationSettings below with your application's values.

/* eslint-disable @typescript-eslint/naming-convention */
import fs from "fs";
import axios from "axios";

// Replace these with your own values.
// Most of them can be found in the Azure AD B2C section of the Azure portal.
// You'll have to create a client secret.
const configurationSettings = {
  tenantId: "11...",
  clientId: "11...",
  scope: "https%3A%2F%2Fgraph.microsoft.com%2F.default",
  clientSecret: "11..",
  pathToSaveResults: "./allUserEmailRecords.json",
};

// This is the url you use to get an access token from
const graphAccessUrl =
  "https://login.microsoftonline.com/" +
  configurationSettings.tenantId +
  "/oauth2/v2.0/token";
const graphTokenBody =
  "client_id=" +
  configurationSettings.clientId +
  "&scope=" +
  configurationSettings.scope +
  "&client_secret=" +
  // client secrets can contain characters such as "+" that must be URL-encoded
  encodeURIComponent(configurationSettings.clientSecret) +
  "&grant_type=client_credentials";
// This is the graph api url we will use to retrieve users.
// We ask for the maximum page size of 999 and we limit the result set to the data we want
// - the id and the identities object.
const graphUsersUrl =
  "https://graph.microsoft.com/v1.0/users?$top=999&$select=id,identities";

// eslint-disable-next-line @typescript-eslint/no-floating-promises
(async () => {
  try {
    const tokenResponse = await axios.post(graphAccessUrl, graphTokenBody);
    const token = tokenResponse.data?.access_token as string;
    // eslint-disable-next-line @typescript-eslint/no-explicit-any
    let allMappedUsers: any[] = [];

    // Here we get the first set of results.
    let graphApiResponse = await axios.get(graphUsersUrl, {
      headers: {
        Authorization: "Bearer " + token,
        Accept: "application/json",
      },
    });

    // map this response in to the set to return later
    allMappedUsers = allMappedUsers.concat(
      // eslint-disable-next-line @typescript-eslint/no-explicit-any
      mapUserEmails(graphApiResponse.data.value as Array<any>)
    );

    // Now we check for any pages and we get each page and map it into the
    // full result set if found.
    while (graphApiResponse.data["@odata.nextLink"] !== undefined) {
      const url = graphApiResponse.data["@odata.nextLink"];
      graphApiResponse = await axios.get(url, {
        headers: {
          Authorization: "Bearer " + token,
          Accept: "application/json",
        },
      });
      console.log("mapping another page...");
      allMappedUsers = allMappedUsers.concat(
        // eslint-disable-next-line @typescript-eslint/no-explicit-any
        mapUserEmails(graphApiResponse.data.value as Array<any>)
      );
    }

    fs.writeFileSync(
      configurationSettings.pathToSaveResults,
      JSON.stringify(allMappedUsers)
    );
  } catch (error) {
    console.error(error);
  }
})();

// There are multiple identities. We want to use the one that
// is of type "emailAddress"
// eslint-disable-next-line @typescript-eslint/no-explicit-any
function mapUserEmails(userData: Array<any>) {
  return userData.map((userInstance) => {
    return {
      userId: userInstance.id,
      userEmail: (
        userInstance.identities as Array<{
          signInType: string;
          issuerAssignedId: string;
        }>
      ).find((userIdentity) => userIdentity.signInType === "emailAddress")
        ?.issuerAssignedId,
    };
  });
}

Let me know if you have any comments or questions!