pacsifier.cli.get_pseudonyms module

Script to get the new pseudonyms and day shifts in JSON format.

The script can be used in two modes:

de-id: use the de-ID API to get new pseudonyms and day shifts
custom: use a custom mapping file in CSV format that specifies the mapping of old / new pseudonyms

In de-id mode, the script requires a PACSIFIER query file and a configuration file for the de-ID API. In custom mode, it requires a custom mapping file in CSV format.

The script saves the new pseudonyms and day shifts as JSON files in the specified output directory.
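
As a rough illustration of the output step, the following sketch writes two mappings (old Patient ID to new pseudonym, and old Patient ID to day shift) as JSON files. The helper name save_mapping and the file names are hypothetical; the script's real file names and JSON layout are not documented here.

```python
import json
import os
from typing import Dict


def save_mapping(mapping: Dict[str, object], out_directory: str, filename: str) -> str:
    """Hypothetical helper: dump one mapping to <out_directory>/<filename>."""
    os.makedirs(out_directory, exist_ok=True)
    path = os.path.join(out_directory, filename)
    with open(path, "w") as handle:
        json.dump(mapping, handle, indent=2)
    return path


def save_outputs(pseudonyms: Dict[str, str], day_shifts: Dict[str, int],
                 out_directory: str) -> None:
    # File names below are assumptions made for this example only.
    save_mapping(pseudonyms, out_directory, "new_pseudonyms.json")
    save_mapping(day_shifts, out_directory, "day_shifts.json")
```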

Example usage:

python get_pseudonyms.py --mode de-id --config config.json \
    --queryfile query.csv --project_name PACSIFIERCohort \
    --out_directory /path/to/output
python get_pseudonyms.py --mode custom --mappingfile mapping.csv \
    --shift-days --project_name PACSIFIERCohort \
    --out_directory /path/to/output

pacsifier.cli.get_pseudonyms.check_config_file_deid(config_file: Dict[str, str]) → None

Check that the config file passed as a parameter is valid.

Parameters:

config_file – dictionary loaded from the JSON configuration file
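
A minimal sketch of what such a check might look like, assuming the only required entries are the deid_URL and deid_token keys shown in the deid_parameters format on this page. The function name validate_deid_config and the choice of ValueError are hypothetical and do not describe the module's actual implementation.

```python
from typing import Dict

# Assumed required keys, taken from the deid_parameters example on this page.
REQUIRED_KEYS = ("deid_URL", "deid_token")


def validate_deid_config(config: Dict[str, str]) -> None:
    """Hypothetical check: raise ValueError if a required key is missing or empty."""
    missing = [key for key in REQUIRED_KEYS if not config.get(key)]
    if missing:
        raise ValueError(f"Missing or empty config entries: {', '.join(missing)}")
```

Failing early on a malformed configuration avoids sending a request to the de-ID API that cannot succeed.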

pacsifier.cli.get_pseudonyms.check_queryfile_content(queryfile: str) → None

Check that the PACSIFIER query file is valid.

Parameters:

queryfile – the path of the PACSIFIER query file

pacsifier.cli.get_pseudonyms.convert_csv_to_deid_json(queryfile: str, project_name: str) → Dict[str, Any]

Convert a PACSIFIER query file to the JSON format expected by the de-ID API.

Parameters:

queryfile – the filename of the PACSIFIER query file
project_name – the name of the project in GPCR (may or may not correspond to Kheops album)

Returns:

JSON object suitable for the API

pacsifier.cli.get_pseudonyms.generate_csv_with_pseudonyms_and_day_shifts(queryfile: str, pseudonyms: Dict[str, str], day_shifts: Dict[str, int], output_dir: str) → None

Create a CSV file with the original query file columns, new pseudonyms, and day shifts.

Parameters:

queryfile – path to the original PACSIFIER query file
pseudonyms – dictionary mapping old Patient IDs to new pseudonyms
day_shifts – dictionary mapping old Patient IDs to day shifts
output_dir – path to save the resulting CSV file
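
The following sketch shows one way such a file could be produced with the standard csv module. It is not the module's code: the function name, the output file name, the added column names, the comma delimiter and the PatientID column name are all assumptions made for illustration.

```python
import csv
import os
from typing import Dict


def write_csv_with_pseudonyms(queryfile: str, pseudonyms: Dict[str, str],
                              day_shifts: Dict[str, int], output_dir: str) -> str:
    """Hypothetical sketch: copy the query rows and append two mapping columns."""
    out_path = os.path.join(output_dir, "query_with_pseudonyms.csv")  # assumed name
    with open(queryfile, newline="") as src, open(out_path, "w", newline="") as dst:
        reader = csv.DictReader(src)
        # Keep the original columns and add two assumed column names.
        fieldnames = list(reader.fieldnames or []) + ["new_pseudonym", "day_shift"]
        writer = csv.DictWriter(dst, fieldnames=fieldnames)
        writer.writeheader()
        for row in reader:
            patient_id = row["PatientID"]  # assumed column name
            # Leave the cell empty when a patient has no mapping entry.
            row["new_pseudonym"] = pseudonyms.get(patient_id, "")
            row["day_shift"] = day_shifts.get(patient_id, "")
            writer.writerow(row)
    return out_path
```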

pacsifier.cli.get_pseudonyms.get_deid_day_shifts(deid_parameters: Dict[str, str], query_json: Dict[str, Any]) → str

Run the de-ID request for day shifts and return the response as JSON.

Parameters:

deid_parameters –

dictionary containing the de-ID URL and token in the following format:

{
    "deid_URL": "https://dummy.url.example",
    "deid_token": "1234567890"
}

query_json – the PACSIFIER query formatted as a dictionary

Returns:

JSON object containing the day shifts for each patient

pacsifier.cli.get_pseudonyms.get_deid_pseudonyms(deid_parameters: Dict[str, str], query_json: Dict[str, Any]) → str

Run the de-ID request for pseudonyms and return the response as JSON.

Parameters:

deid_parameters –

dictionary containing the de-ID URL and token in the following format:

{
    "deid_URL": "https://dummy.url.example",
    "deid_token": "1234567890"
}

query_json – the PACSIFIER query formatted as a dictionary

Returns:

JSON object containing the new pseudonyms for each patient

pacsifier.cli.get_pseudonyms.get_parser() → ArgumentParser

Get parser for command line arguments.
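
For orientation, a parser accepting the flags used in the example usage above could be sketched as follows. Only the flag names come from this page; the function name build_parser, the help texts, the required and choices settings, and the treatment of --shift-days as a boolean switch are assumptions.

```python
from argparse import ArgumentParser


def build_parser() -> ArgumentParser:
    """Hypothetical parser covering only the flags shown in the usage example."""
    parser = ArgumentParser(description="Get new pseudonyms and day shifts.")
    # The two mode names come from the module description; required=True is assumed.
    parser.add_argument("--mode", choices=["de-id", "custom"], required=True)
    parser.add_argument("--config", help="de-ID API configuration file (de-id mode)")
    parser.add_argument("--queryfile", help="PACSIFIER query file (de-id mode)")
    parser.add_argument("--mappingfile", help="custom mapping CSV (custom mode)")
    parser.add_argument("--shift-days", action="store_true")  # assumed boolean flag
    parser.add_argument("--project_name")
    parser.add_argument("--out_directory")
    return parser


# Parse the custom-mode command line from the usage example.
args = build_parser().parse_args(
    ["--mode", "custom", "--mappingfile", "mapping.csv", "--shift-days",
     "--project_name", "PACSIFIERCohort", "--out_directory", "/path/to/output"]
)
```

Note that argparse exposes --shift-days as args.shift_days, replacing the hyphen with an underscore.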

pacsifier.cli.get_pseudonyms.main()

Main function of the script.

pacsifier.cli.get_pseudonyms.split_deid_query_json_in_batch(deid_query_json: Dict[str, Any], batch_size: int = 500) → List[Dict[str, Any]]

Split the patients in the given query into batches of batch_size patients.

Parameters:

deid_query_json – dictionary loaded from the de-ID JSON file
batch_size – the number of patients in one batch (default: 500)

Returns:

List of dictionaries, each containing a batch of patients with the project information
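
The batching idea can be sketched as below. The layout of the query dictionary is not documented on this page, so the "patients" and "project" keys, like the function name split_in_batches, are hypothetical stand-ins.

```python
from typing import Any, Dict, List


def split_in_batches(query: Dict[str, Any], batch_size: int = 500) -> List[Dict[str, Any]]:
    """Hypothetical sketch: chunk query["patients"], copying the other keys."""
    patients = query["patients"]  # assumed key holding the list of patients
    # Everything except the patient list is treated as project information.
    project_info = {k: v for k, v in query.items() if k != "patients"}
    return [
        {**project_info, "patients": patients[start:start + batch_size]}
        for start in range(0, len(patients), batch_size)
    ]


# Five patients with a batch size of two give batches of 2, 2 and 1 patients.
batches = split_in_batches(
    {"project": "PACSIFIERCohort", "patients": list(range(5))}, batch_size=2
)
```

In this sketch the last batch holds the remainder when the number of patients is not a multiple of batch_size.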