pacsifier.cli.get_pseudonyms module
Script to get the new pseudonyms and day shifts in JSON format.
The script can be used in two modes:
- de-id: use the de-ID API to get new pseudonyms and day shifts
- custom: use a custom mapping file in CSV format that specifies the mapping of old to new pseudonyms
In de-id mode, the script requires a PACSIFIER query file and a configuration file for the de-ID API. In custom mode, the script requires a custom mapping file in CSV format.
The script saves the new pseudonyms and day shifts as JSON files in the specified output directory.
Example usage:
python get_pseudonyms.py --mode de-id --config config.json \
    --queryfile query.csv --project_name PACSIFIERCohort \
    --out_directory /path/to/output

python get_pseudonyms.py --mode custom --mappingfile mapping.csv \
    --shift-days --project_name PACSIFIERCohort \
    --out_directory /path/to/output
- pacsifier.cli.get_pseudonyms.check_config_file_deid(config_file: Dict[str, str]) → None
Check that the config file passed as a parameter is valid.
- Parameters:
config_file – dictionary loaded from the config json file
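A minimal sketch of what such a validation could look like, assuming the required entries are the two keys shown in the deid_parameters example elsewhere in this module (deid_URL and deid_token). The helper name validate_deid_config and the use of ValueError are hypothetical and are not taken from the PACSIFIER source.

```python
from typing import Dict

# Assumption: these are the required keys, based on the deid_parameters
# example given in this module's documentation.
REQUIRED_KEYS = ("deid_URL", "deid_token")


def validate_deid_config(config: Dict[str, str]) -> None:
    """Raise ValueError if a required key is missing or empty (hypothetical helper)."""
    missing = [key for key in REQUIRED_KEYS if not config.get(key)]
    if missing:
        raise ValueError(f"Missing or empty config entries: {', '.join(missing)}")


# A complete configuration passes silently.
validate_deid_config({"deid_URL": "https://dummy.url.example", "deid_token": "1234567890"})
```

Failing early on an incomplete configuration avoids sending a request to the API that cannot succeed.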
- pacsifier.cli.get_pseudonyms.check_queryfile_content(queryfile: str) → None
Check that the PACSIFIER query file is valid.
- Parameters:
queryfile – the path of the PACSIFIER query file
- pacsifier.cli.get_pseudonyms.convert_csv_to_deid_json(queryfile: str, project_name: str) → Dict[str, Any]
Convert the PACSIFIER query file to the JSON format that the de-ID API understands.
- Parameters:
queryfile – the filename of the PACSIFIER query file
project_name – the name of the project in GPCR (may or may not correspond to Kheops album)
- Returns:
JSON object suitable for the API
- pacsifier.cli.get_pseudonyms.generate_csv_with_pseudonyms_and_day_shifts(queryfile: str, pseudonyms: Dict[str, str], day_shifts: Dict[str, int], output_dir: str) → None
Create a CSV file with the original query file columns, new pseudonyms, and day shifts.
- Parameters:
queryfile – path to the original PACSIFIER query file
pseudonyms – dictionary mapping old Patient IDs to new pseudonyms
day_shifts – dictionary mapping old Patient IDs to day shifts
output_dir – path to save the resulting CSV file
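The following sketch illustrates the idea with the standard csv module. It is not the PACSIFIER implementation: the function name, the comma delimiter, the PatientID column, the added column names (new_pseudonym, day_shift), and the output file name are all assumptions made for illustration. Unlike the documented function, it returns the output path so the result is easy to inspect.

```python
import csv
import os
from typing import Dict


def write_csv_with_pseudonyms(queryfile: str, pseudonyms: Dict[str, str],
                              day_shifts: Dict[str, int], output_dir: str,
                              id_column: str = "PatientID") -> str:
    """Copy the query rows and append pseudonym and day-shift columns (hypothetical sketch)."""
    os.makedirs(output_dir, exist_ok=True)
    # Assumed output file name, not taken from PACSIFIER.
    out_path = os.path.join(output_dir, "query_with_pseudonyms.csv")
    with open(queryfile, newline="") as src, open(out_path, "w", newline="") as dst:
        reader = csv.DictReader(src)
        fieldnames = list(reader.fieldnames or []) + ["new_pseudonym", "day_shift"]
        writer = csv.DictWriter(dst, fieldnames=fieldnames)
        writer.writeheader()
        for row in reader:
            old_id = row[id_column]
            # Leave the cells empty when a patient has no mapping.
            row["new_pseudonym"] = pseudonyms.get(old_id, "")
            row["day_shift"] = day_shifts.get(old_id, "")
            writer.writerow(row)
    return out_path
```

Keeping the original columns next to the new ones makes it possible to check each mapping row by row.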
- pacsifier.cli.get_pseudonyms.get_deid_day_shifts(deid_parameters: Dict[str, str], query_json: Dict[str, Any]) → str
Run the de-ID request for day shifts and return the response as JSON.
- Parameters:
deid_parameters – dictionary containing the de-ID URL and token in the following format:
{ "deid_URL": "https://dummy.url.example", "deid_token": "1234567890" }
query_json – the PACSIFIER query formatted as a dictionary
- Returns:
JSON object containing the day shifts for each patient
- pacsifier.cli.get_pseudonyms.get_deid_pseudonyms(deid_parameters: Dict[str, str], query_json: Dict[str, Any]) → str
Run the de-ID request for pseudonyms and return the response as JSON.
- Parameters:
deid_parameters – dictionary containing the de-ID URL and token in the following format:
{ "deid_URL": "https://dummy.url.example", "deid_token": "1234567890" }
query_json – the PACSIFIER query formatted as a dictionary
- Returns:
JSON object containing the new pseudonyms for each patient
- pacsifier.cli.get_pseudonyms.get_parser() → ArgumentParser
Get the parser for the command line arguments.
- pacsifier.cli.get_pseudonyms.split_deid_query_json_in_batch(deid_query_json: Dict[str, Any], batch_size: int = 500) → List[Dict[str, Any]]
Split the patients provided in the query into several batches of batch_size length.
- Parameters:
deid_query_json – dictionary loaded from the de-ID JSON file
batch_size – the number of patients in one batch
- Returns:
List of dictionaries, each containing a batch of patients with the project information
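The batching can be sketched as below. The layout of the de-ID query is not documented here, so the keys "project" and "patients" are hypothetical placeholders, as is the function name split_in_batches. The sketch only shows the general technique: slice the patient list and repeat the shared fields in every batch.

```python
from typing import Any, Dict, List


def split_in_batches(query: Dict[str, Any], batch_size: int = 500,
                     patients_key: str = "patients") -> List[Dict[str, Any]]:
    """Split the patient list into chunks, repeating the other fields in each batch."""
    if batch_size < 1:
        raise ValueError("batch_size must be a positive integer")
    patients = query[patients_key]
    # Everything except the patient list (e.g. project information) is shared.
    shared = {key: value for key, value in query.items() if key != patients_key}
    return [
        {**shared, patients_key: patients[start:start + batch_size]}
        for start in range(0, len(patients), batch_size)
    ]


# Hypothetical query layout, used only for illustration.
example = {"project": "PACSIFIERCohort", "patients": ["P1", "P2", "P3"]}
batches = split_in_batches(example, batch_size=2)
print([len(batch["patients"]) for batch in batches])  # [2, 1]
```

The last batch may be shorter than batch_size when the number of patients is not an exact multiple of it.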