pacsifier.cli.get_pseudonyms module

Script to get the new pseudonyms and day shifts in JSON format.

The script can be used in two modes:

  • de-id: use the de-ID API to get new pseudonyms and day shifts

  • custom: use a custom mapping file in CSV format that specifies the mapping from old to new pseudonyms

In de-id mode, the script requires a PACSIFIER query file and a configuration file for the de-ID API. In custom mode, it requires a custom mapping file in CSV format.

The script saves the new pseudonyms and day shifts as JSON files in the specified output directory.

Example usage:

python get_pseudonyms.py --mode de-id --config config.json \
    --queryfile query.csv --project_name PACSIFIERCohort \
    --out_directory /path/to/output
python get_pseudonyms.py --mode custom --mappingfile mapping.csv \
    --shift-days --project_name PACSIFIERCohort \
    --out_directory /path/to/output

pacsifier.cli.get_pseudonyms.check_config_file_deid(config_file: Dict[str, str]) → None

Check that the config file passed as a parameter is valid.

Parameters:

config_file – dictionary loaded from the config json file
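
The kind of check described here could look roughly like the sketch below. It is not the module's actual implementation: the function name check_config_sketch is hypothetical, and the only facts taken from this page are the two keys ("deid_URL" and "deid_token") shown in the de-ID parameter examples. Which conditions the real function enforces is not documented here.

```python
from typing import Dict

# Keys taken from the deid_parameters example on this page.
REQUIRED_KEYS = ("deid_URL", "deid_token")


def check_config_sketch(config: Dict[str, str]) -> None:
    """Hypothetical validator: raise ValueError on a missing or empty key."""
    for key in REQUIRED_KEYS:
        value = config.get(key)
        # Assumption: each required value must be a non-empty string.
        if not isinstance(value, str) or not value.strip():
            raise ValueError(f"config file: missing or empty '{key}'")
```

Raising an exception rather than returning a boolean matches the documented None return type, but the exception class used by the real function is an assumption.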

pacsifier.cli.get_pseudonyms.check_queryfile_content(queryfile: str) → None

Check that the PACSIFIER query file is valid.

Parameters:

queryfile – the path of the PACSIFIER query file

pacsifier.cli.get_pseudonyms.convert_csv_to_deid_json(queryfile: str, project_name: str) → Dict[str, Any]

Convert a PACSIFIER query file to the JSON format expected by the de-ID API.

Parameters:
  • queryfile – the filename of the PACSIFIER query file

  • project_name – the name of the project in GPCR (may or may not correspond to a Kheops album)

Returns:

JSON object suitable for the API

pacsifier.cli.get_pseudonyms.generate_csv_with_pseudonyms_and_day_shifts(queryfile: str, pseudonyms: Dict[str, str], day_shifts: Dict[str, int], output_dir: str) → None

Create a CSV file with the original query file columns, new pseudonyms, and day shifts.

Parameters:
  • queryfile – path to the original PACSIFIER query file

  • pseudonyms – dictionary mapping old Patient IDs to new pseudonyms

  • day_shifts – dictionary mapping old Patient IDs to day shifts

  • output_dir – path to save the resulting CSV file
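
As an illustration of the operation described above, the following sketch copies a query CSV and appends two columns. Everything beyond the four documented parameters is an assumption: the function name, the "PatientID" column used as the lookup key, the output file name, and the names of the added columns are all hypothetical and may differ from what the module produces.

```python
import csv
import os
from typing import Dict


def write_pseudonymised_csv_sketch(
    queryfile: str,
    pseudonyms: Dict[str, str],
    day_shifts: Dict[str, int],
    output_dir: str,
    id_column: str = "PatientID",          # assumed name of the ID column
    out_name: str = "new_pseudonyms.csv",  # assumed output file name
) -> str:
    """Hypothetical helper: copy the query CSV and add two columns."""
    with open(queryfile, newline="") as src:
        reader = csv.DictReader(src)
        fieldnames = list(reader.fieldnames or [])
        rows = list(reader)

    out_path = os.path.join(output_dir, out_name)
    with open(out_path, "w", newline="") as dst:
        writer = csv.DictWriter(
            dst, fieldnames=fieldnames + ["new_pseudonym", "day_shift"]
        )
        writer.writeheader()
        for row in rows:
            old_id = row[id_column]
            # Patients without a mapping get empty cells in this sketch.
            row["new_pseudonym"] = pseudonyms.get(old_id, "")
            row["day_shift"] = day_shifts.get(old_id, "")
            writer.writerow(row)
    return out_path
```

The sketch returns the output path for convenience, whereas the documented function returns None.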

pacsifier.cli.get_pseudonyms.get_deid_day_shifts(deid_parameters: Dict[str, str], query_json: Dict[str, Any]) → str

Run the de-ID request for day shifts and return the response as JSON.

Parameters:
  • deid_parameters – dictionary containing the de-ID URL and token in the following format:

    {
        "deid_URL": "https://dummy.url.example",
        "deid_token": "1234567890"
    }
    

  • query_json – the PACSIFIER query formatted as a dictionary

Returns:

JSON object containing the day shifts for each patient

pacsifier.cli.get_pseudonyms.get_deid_pseudonyms(deid_parameters: Dict[str, str], query_json: Dict[str, Any]) → str

Run the de-ID request for pseudonyms and return the response as JSON.

Parameters:
  • deid_parameters – dictionary containing the de-ID URL and token in the following format:

    {
        "deid_URL": "https://dummy.url.example",
        "deid_token": "1234567890"
    }
    

  • query_json – the PACSIFIER query formatted as a dictionary

Returns:

JSON object containing the new pseudonyms for each patient
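
To show how the two deid_parameters entries might be combined with the query, here is a sketch that only builds an HTTP request with the standard library and does not send it. The function name is hypothetical, and the use of POST, a JSON body, and a bearer-token Authorization header are all assumptions; this page does not document the de-ID API's endpoints or authentication scheme.

```python
import json
import urllib.request
from typing import Any, Dict


def build_deid_request_sketch(
    deid_parameters: Dict[str, str], query_json: Dict[str, Any]
) -> urllib.request.Request:
    """Hypothetical helper: build (but do not send) a JSON POST request."""
    return urllib.request.Request(
        url=deid_parameters["deid_URL"],
        data=json.dumps(query_json).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumption: bearer-token auth; the real API may differ.
            "Authorization": f"Bearer {deid_parameters['deid_token']}",
        },
        method="POST",
    )
```

Under these assumptions, passing the returned object to urllib.request.urlopen would send the request, and the response body could then be decoded as text, which would be consistent with the documented str return type.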

pacsifier.cli.get_pseudonyms.get_parser() → ArgumentParser

Get parser for command line arguments.
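
A parser accepting the options that appear in the example usage at the top of this page might be set up as in the sketch below. The option names come from those examples; the function name, the help strings, which options are required, and the treatment of --shift-days as a boolean flag are assumptions rather than the module's actual definitions.

```python
import argparse


def get_parser_sketch() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the options in the usage examples."""
    parser = argparse.ArgumentParser(
        description="Get new pseudonyms and day shifts in JSON format."
    )
    parser.add_argument("--mode", choices=["de-id", "custom"], required=True)
    parser.add_argument("--config", help="de-ID API configuration file (de-id mode)")
    parser.add_argument("--queryfile", help="PACSIFIER query file (de-id mode)")
    parser.add_argument("--mappingfile", help="CSV mapping file (custom mode)")
    # Assumption: a flag without a value, as it appears in the example.
    parser.add_argument("--shift-days", action="store_true")
    parser.add_argument("--project_name")
    parser.add_argument("--out_directory")
    return parser
```

With this sketch, argparse stores --shift-days under the attribute shift_days, so the parsed namespace exposes args.shift_days as a boolean.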

pacsifier.cli.get_pseudonyms.main()

Main function of the script.

pacsifier.cli.get_pseudonyms.split_deid_query_json_in_batch(deid_query_json: Dict[str, Any], batch_size: int = 500) → List[Dict[str, Any]]

Split the patients in the given query into batches of at most batch_size patients.

Parameters:
  • deid_query_json – dictionary loaded from the de-ID JSON file

  • batch_size – the maximum number of patients per batch (default: 500)

Returns:

List of dictionaries, each containing a batch of patients with the project information
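
The batching described above can be sketched as follows. The layout of the de-ID query dictionary is not documented on this page, so the sketch assumes the patients sit in a list under a single key (here the hypothetical "patients") and that every other top-level entry is project information to be repeated in each batch. The function name is likewise hypothetical.

```python
from typing import Any, Dict, List


def split_in_batches_sketch(
    query: Dict[str, Any],
    patients_key: str = "patients",  # assumed key holding the patient list
    batch_size: int = 500,
) -> List[Dict[str, Any]]:
    """Hypothetical helper: split the patient list, keep the other keys."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    patients = list(query[patients_key])
    # Everything except the patient list is copied into every batch.
    shared = {k: v for k, v in query.items() if k != patients_key}
    return [
        {**shared, patients_key: patients[i:i + batch_size]}
        for i in range(0, len(patients), batch_size)
    ]
```

In this sketch the final batch holds whatever patients remain, so it can be smaller than batch_size, and an empty patient list yields an empty list of batches.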