`pacsifier.cli`

The pacsifier.cli subpackage contains multiple modules that define the command line interface (CLI) tools of the pacsifier package.

`pacsifier.cli.pacsifier`

Script to query, retrieve, and upload DICOM images from / to a PACS server.

pacsifier.cli.pacsifier.check_query_table_allowed_filters(table: pandas.DataFrame, allowed_filters: List[str] = ['StudyDate', 'StudyTime', 'SeriesDescription', 'PatientID', 'ProtocolName', 'StudyInstanceUID', 'SeriesInstanceUID', 'PatientName', 'PatientBirthDate', 'DeviceSerialNumber', 'AcquisitionDate', 'Modality', 'ImageType', 'SeriesNumber', 'StudyDescription', 'AccessionNumber', 'SequenceName', 'new_ids']) → None

Check if the csv table passed as input has only attributes that are allowed.

Parameters:

table – table containing all filters
allowed_filters – list of allowed attribute names

pacsifier.cli.pacsifier.get_parser() → ArgumentParser: Return the parser object for this script.

pacsifier.cli.pacsifier.main(): Main function of the script that calls retrieve_dicoms_using_table().

pacsifier.cli.pacsifier.parse_findscu_dump_file(filename: str) → List[Dict[str, str]]

Extract all useful information from the text file generated by dumping the output of the findscu command.

Parameters: filename – path to the text file to be read
Returns: list of dictionaries, each containing the attributes of a series
Return type: list

pacsifier.cli.pacsifier.parse_query_table(table: pandas.DataFrame, allowed_filters: List[str] = ['StudyDate', 'StudyTime', 'SeriesDescription', 'PatientID', 'ProtocolName', 'StudyInstanceUID', 'SeriesInstanceUID', 'PatientName', 'PatientBirthDate', 'DeviceSerialNumber', 'AcquisitionDate', 'Modality', 'ImageType', 'SeriesNumber', 'StudyDescription', 'AccessionNumber', 'SequenceName', 'new_ids']) → List[Dict[str, str]]

Take the query table passed as input to the script and parse it using attributes in the query / retrieve command.

Parameters:

table – input csv table
allowed_filters – list of allowed attributes to be filtered

Returns:

list of dictionaries, each containing the corresponding value of each attribute in allowed_filters (if an attribute has no corresponding column in the csv table, an empty string is given)

Return type:

list
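
The "empty string for a missing column" rule described above can be illustrated with a small stand-alone sketch. It is not the PACSIFIER implementation: the function name `rows_to_filters` is hypothetical, and it reads CSV text with the standard `csv` module instead of a pandas DataFrame so that it has no third-party dependency.

```python
import csv
import io
from typing import Dict, List


def rows_to_filters(csv_text: str, allowed_filters: List[str]) -> List[Dict[str, str]]:
    """Return one dict per CSV row with a key for every allowed filter.

    Hypothetical stand-in for illustration only: columns missing from the
    CSV (or empty cells) are filled with an empty string.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        {name: (row.get(name) or "") for name in allowed_filters}
        for row in reader
    ]


queries = rows_to_filters(
    "PatientID,StudyDate\n123,20170115\n",
    ["PatientID", "StudyDate", "Modality"],
)
print(queries)
# [{'PatientID': '123', 'StudyDate': '20170115', 'Modality': ''}]
```

Here `Modality` has no column in the input, so it appears in the result with an empty value.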

pacsifier.cli.pacsifier.process_person_names(name: str) → str

Modify patient name for the query input.

It converts all characters to uppercase, prepends a * to the last name and returns the new name.

Parameters: name – patient’s name
Returns: patient name in the format used in the query
Return type: string
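
A minimal sketch of the transformation described above, under two assumptions that the documentation does not confirm: the last whitespace-separated token is treated as the last name, and only that token is kept in the result. The function name `sketch_process_person_name` is hypothetical.

```python
def sketch_process_person_name(name: str) -> str:
    """Uppercase the name and prefix the last name with '*'.

    Assumptions (not confirmed by the documentation): the last
    whitespace-separated token is the last name, and only that token
    is kept in the returned string.
    """
    tokens = name.upper().split()
    if not tokens:
        # An empty or blank name yields an empty query value.
        return ""
    return "*" + tokens[-1]


print(sketch_process_person_name("John Doe"))  # *DOE
```

The leading `*` acts as a wildcard in the query, so the match does not depend on what precedes the last name.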

pacsifier.cli.pacsifier.readLineByLine(filename: str) → Iterator[str]

Iterate over the lines of the text file located at the path filename.

Parameters: filename – path to text file to be read
Yields: the text lines of the file, one at a time
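
Because the function is typed as an iterator, it can be consumed lazily. The generator below is a sketch of that pattern, not the actual implementation; the name `read_lines` is hypothetical, and stripping the trailing newline is an assumption.

```python
from typing import Iterator


def read_lines(filename: str) -> Iterator[str]:
    """Yield the lines of a text file one at a time.

    Assumption: trailing newlines are stripped; the documentation does
    not say whether the real function does this.
    """
    with open(filename, "r") as handle:
        for line in handle:
            yield line.rstrip("\n")
```

A caller would typically loop over it directly, for example `for line in read_lines(path): ...`, so that large dump files are never loaded into memory at once.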

pacsifier.cli.pacsifier.retrieve_dicoms_using_table(table: pandas.DataFrame, parameters: Dict[str, str], output_dir: str, save: bool, info: bool, move: bool, resume: bool = False, verbose: bool = False) → None

Query and retrieve dicom images or / and their info dumps using the input query table.

Parameters:

table – query table
parameters – query/retrieve parameters
output_dir – path to the output directory
save – option to save the images
info – option to save info dumps
move – option to move images to remote destination
resume – option to skip already downloaded series
verbose – option to enable verbose output

pacsifier.cli.pacsifier.upload_dicoms(dicom_dir: str, parameters: Dict[str, str]) → None

Upload dicoms to a PACS server.

Parameters:

dicom_dir –
path to the directory containing the dicoms. The directory should follow the structure of the PACSIFIER output directory produced with the --save option. This means that it should contain a subdirectory for each patient, which in turn should contain a subdirectory for each study, which in turn should contain a subdirectory for each series, such as:

dicom_dir
├── sub-01
│   ├── ses-01
│   │   ├── 00001-First_series
│   │   │   ├── image1.dcm
│   │   │   ├── image2.dcm
│   │   │   ├── …
│   │   ├── 00002-Second_series
│   │   │   ├── image1.dcm
│   │   │   ├── image2.dcm
│   │   │   ├── …
├── sub-02
│   ├── ses-01
│   │   ├── 00001-First_series
│   │   │   ├── image1.dcm

parameters – parameters from PACSIFIER configuration file
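
Before an upload, the patient / study / series layout described above can be checked with a short directory traversal. The following is only a sketch and not part of PACSIFIER: the helper name `list_series_dirs` is hypothetical, and the `sub-*` and `ses-*` patterns are assumptions taken from the example tree.

```python
from pathlib import Path
from typing import Dict, List


def list_series_dirs(dicom_dir: str) -> Dict[str, List[str]]:
    """Map each series directory (relative path) to the sorted file names it holds.

    Hypothetical helper: assumes patient folders named sub-* and study
    folders named ses-*, each study holding one folder per series.
    """
    root = Path(dicom_dir)
    series: Dict[str, List[str]] = {}
    # One directory level per patient, one per study, one per series.
    for series_dir in sorted(root.glob("sub-*/ses-*/*")):
        if series_dir.is_dir():
            files = sorted(p.name for p in series_dir.iterdir() if p.is_file())
            series[series_dir.relative_to(root).as_posix()] = files
    return series
```

An empty result for a non-empty directory is a hint that the folder does not follow the expected three-level structure.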

`pacsifier.cli.anonymize_dicoms`

Script to anonymize DICOM files for subsequent upload to Kheops.

pacsifier.cli.anonymize_dicoms.anonymize_all_dicoms_within_root_folder(output_folder: str = '.', datapath: str = './data', pattern_dicom_files: str = 'ses-*/*/*', new_ids: str | None = None, rename_patient_directories: bool = True, delete_identifiable_files: bool = True, remove_private_tags: bool = False, fuzz_acq_dates: bool = False) → Dict[str, str]

Anonymizes all dicom images located at the datapath in the structure specified by pattern_dicom_files parameter.

Parameters:

output_folder – path where anonymized images will be located
datapath – path to the dicom images
pattern_dicom_files – (generic) path to the dicom images starting from the patient folder (in a PACSIFIER dump, this would reflect e.g. ses-20170115/0002-MPRAGE/*.dcm)
new_ids – anonymous ids to be set after anonymizing the original ids
rename_patient_directories – rename patient directories using the anonymized ids if True
delete_identifiable_files – delete DICOM Series which have identifiable information in the image data itself if True (in the case of screen savings coming from the GE Revolution CT machine, which have the patient name embedded for example)
remove_private_tags – remove all private tags if True
fuzz_acq_dates – shift the acquisition-related dates randomly by +- 30 days if True

Returns:

dictionary keeping track of the new patientIDs and old patientIDs mappings

Return type:

dict

pacsifier.cli.anonymize_dicoms.anonymize_dicom_file(filename: str, output_filename: str, PatientID: str, new_StudyInstanceUID: str, new_SeriesInstanceUID: str, new_SOPInstanceUID: str, fuzz_birthdate: bool = True, fuzz_acqdates: bool = False, fuzz_days_shift: int = 0, delete_identifiable_files: bool = False, remove_private_tags: bool = False) → None

Anonymize the dicom image located at filename by modifying the patient ID, patient name and dates.

If identifiable data is present, deletes the file.

Parameters:

filename – path to dicom image
output_filename – output path of anonymized image
PatientID – the new patientID after anonymization
new_StudyInstanceUID – study instance UID to be used for depersonalisation. This should be a DICOM VR UI
new_SeriesInstanceUID – series instance UID to be used for depersonalisation. This should be a DICOM VR UI
new_SOPInstanceUID – SOP instance UID to be used for depersonalisation. This should be a DICOM VR UI
fuzz_birthdate – if True, fuzz the birthdate
fuzz_acqdates – if True, fuzz acquisition-related dates including study date, InstanceCreationDate, SeriesDate, AcquisitionDate, ContentDate, PerformedProcedureStepStartDate, and (07a3,101b) ST (e.g. 201703251500), (07a3,1020) DA
fuzz_days_shift – number of days to shift dates (birth and various acquisition dates) by (can be positive or negative)
delete_identifiable_files – if True, delete DICOM Series which have identifiable information in the image data itself (in the case of SCREEN SAVE image type for dose reports coming from the GE Revolution CT machine, which have the patient name embedded, and from Toshiba/Canon Aquilion Prime, although these don’t have SCREEN SAVE label in ImageType tag)
remove_private_tags – if True remove all private tags

pacsifier.cli.anonymize_dicoms.fuzz_date(date: str, fuzz_parameter: int = 30) → Tuple[str, int]

Fuzz a date in a range of fuzz_parameter days prior to fuzz_parameter days after.

Parameters:

date – date in YYYYMMDD format
fuzz_parameter – the number of days by which the date will be fuzzed

Returns:

str_date – new fuzzed date
fuzz – number of days used in the offset (can be positive or negative)

Return type:

tuple
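
The fuzzing described above amounts to drawing a random day offset and applying it to the date. The sketch below uses only the standard library; the name `fuzz_date_sketch` is hypothetical, and drawing the offset uniformly (with zero allowed) is an assumption, since the documentation does not specify the distribution.

```python
import random
from datetime import datetime, timedelta
from typing import Tuple


def fuzz_date_sketch(date: str, fuzz_parameter: int = 30) -> Tuple[str, int]:
    """Shift a YYYYMMDD date by a random offset in [-fuzz_parameter, fuzz_parameter].

    Assumption: the offset is drawn uniformly and may be zero; the real
    function may behave differently (for example, exclude a zero shift).
    """
    fuzz = random.randint(-fuzz_parameter, fuzz_parameter)
    shifted = datetime.strptime(date, "%Y%m%d") + timedelta(days=fuzz)
    # Return both the new date and the offset so the same shift can be reused.
    return shifted.strftime("%Y%m%d"), fuzz
```

Returning the offset alongside the date allows a caller to apply the same shift to every other date of the same patient, which keeps intervals between dates intact.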

pacsifier.cli.anonymize_dicoms.get_parser() → ArgumentParser: Get parser object for command line arguments of the script.

pacsifier.cli.anonymize_dicoms.main(): Main function of the script that calls anonymize_all_dicoms_within_root_folder().

pacsifier.cli.anonymize_dicoms.parse_date(date: str) → Tuple[int, int, int]

Extract year, month, day from a date.

Parameters: date – date in YYYYMMDD format
Returns: year, month, day
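
Since the input format is fixed at YYYYMMDD, the extraction can be done with plain string slicing. This is a sketch under that assumption; the name `parse_date_sketch` is hypothetical, and the input validation is an addition that the real function may not perform.

```python
from typing import Tuple


def parse_date_sketch(date: str) -> Tuple[int, int, int]:
    """Split a YYYYMMDD string into integer year, month and day."""
    # Validation is an assumption; the documented function may not check this.
    if len(date) != 8 or not date.isdigit():
        raise ValueError(f"expected YYYYMMDD, got {date!r}")
    return int(date[:4]), int(date[4:6]), int(date[6:8])


print(parse_date_sketch("20170115"))  # (2017, 1, 15)
```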

pacsifier.cli.anonymize_dicoms.shift_date_by_some_days(date_str: str, shift: int) → str

Add or subtract days from a date.

Parameters:

date_str – date in YYYYMMDD format
shift – the number of days by which the date will be shifted (can be positive or negative)

Returns:

new_date_str – shifted date

Return type:

str
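
Shifting a YYYYMMDD date by a signed number of days is straightforward with `datetime` and `timedelta`, which handle month and year boundaries. The sketch below is illustrative only; the name `shift_date_sketch` is hypothetical, and returning the result in YYYYMMDD format is an assumption.

```python
from datetime import datetime, timedelta


def shift_date_sketch(date_str: str, shift: int) -> str:
    """Return date_str (YYYYMMDD) moved by shift days; shift may be negative."""
    shifted = datetime.strptime(date_str, "%Y%m%d") + timedelta(days=shift)
    return shifted.strftime("%Y%m%d")


print(shift_date_sketch("20170115", 20))   # 20170204
print(shift_date_sketch("20170115", -15))  # 20161231
```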

`pacsifier.cli.create_dicomdir`

Script to create a DICOMDIR of all dicoms within a folder.

pacsifier.cli.create_dicomdir.add_or_retrieve_name(current_folder: str, old_2_new: Dict[str, str]) → Tuple[str, Dict[str, str]]

Check if the current folder has had a generated new name. If that is the case, return its new name, otherwise, generate a new name.

Parameters:

current_folder – current folder to be considered
old_2_new – dictionary keeping track of mapping between old and new folder / file names

Returns:

tuple containing the new name and the updated mapping between old and new name.

Return type:

tuple

pacsifier.cli.create_dicomdir.create_dicomdir(out_path: str) → None

Create a DICOMDIR of all dicoms within the path passed as parameter.

Parameters: out_path – path of dicoms

pacsifier.cli.create_dicomdir.generate_new_folder_name(names: List[str] = []) → str

Generate a folder/file name having between 4 and 8 characters of capital letters and digits.

Parameters: names – new names already generated for other folders.
Returns: Generated folder/file name.
Return type: str
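
The naming rule above (4 to 8 characters, capital letters and digits, distinct from names already in use) can be sketched as follows. The function name `new_folder_name_sketch` is hypothetical, and retrying until an unused name is found is an assumption about how collisions are avoided.

```python
import random
import string
from typing import List, Optional


def new_folder_name_sketch(names: Optional[List[str]] = None) -> str:
    """Return a random name of 4 to 8 capital letters and digits not in names.

    Assumption: collisions with existing names are resolved by drawing again.
    """
    taken = set(names or [])
    alphabet = string.ascii_uppercase + string.digits
    while True:
        length = random.randint(4, 8)
        candidate = "".join(random.choices(alphabet, k=length))
        if candidate not in taken:
            return candidate
```

The sketch uses `None` as the default instead of a mutable `[]`, which avoids sharing one list object between calls.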

pacsifier.cli.create_dicomdir.get_parser() → ArgumentParser: Get parser object for command line arguments of the script.

pacsifier.cli.create_dicomdir.main(): Main function of the script that calls move_and_rename_files() and create_dicomdir().

pacsifier.cli.create_dicomdir.move_and_rename_files(dicom_path: str, output_path: str) → None

Copy all the files within the dicom hierarchy into new hierarchy with appropriate names for DICOMDIR creation.

Parameters:

dicom_path – current folder to be considered
output_path – path where the new dicom hierarchy will be stored

`pacsifier.cli.get_pseudonyms`

Script to get the new pseudonyms and day shifts in JSON format.

The script can be used in two modes:

de-id: use the de-ID API to get new pseudonyms and day shifts
custom: use a custom mapping file in CSV format that specifies the mapping of old / new pseudonyms

In case of the de-id mode, the script requires a PACSIFIER query file and a configuration file for the de-ID API. In case of the custom mode, the script requires a custom mapping file in CSV format.

The script saves the new pseudonyms and day shifts as JSON files in the specified output directory.

Example usage:

python get_pseudonyms.py --mode de-id --config config.json \
    --queryfile query.csv --project_name PACSIFIERCohort \
    --out_directory /path/to/output
python get_pseudonyms.py --mode custom --mappingfile mapping.csv \
    --shift-days --project_name PACSIFIERCohort \
    --out_directory /path/to/output

pacsifier.cli.get_pseudonyms.check_config_file_deid(config_file: Dict[str, str]) → None

Check that the config file passed as a parameter is valid.

Parameters: config_file – dictionary loaded from the config json file

pacsifier.cli.get_pseudonyms.check_queryfile_content(queryfile: str) → None

Check that the PACSIFIER query file is valid.

Parameters: queryfile – the path of the PACSIFIER query file

pacsifier.cli.get_pseudonyms.convert_csv_to_deid_json(queryfile: str, project_name: str) → Dict[str, Any]

Convert PACSIFIER query to json format the de-ID API can understand.

Parameters:

queryfile – the filename of the PACSIFIER query file
project_name – the name of the project in GPCR (may or may not correspond to Kheops album)

Returns:

JSON object suitable for the API

pacsifier.cli.get_pseudonyms.generate_csv_with_pseudonyms_and_day_shifts(queryfile: str, pseudonyms: Dict[str, str], day_shifts: Dict[str, int], output_dir: str) → None

Create a CSV file with the original query file columns, new pseudonyms, and day shifts.

Parameters:

queryfile – path to the original PACSIFIER query file
pseudonyms – dictionary mapping old Patient IDs to new pseudonyms
day_shifts – dictionary mapping old Patient IDs to day shifts
output_dir – path to save the resulting CSV file

pacsifier.cli.get_pseudonyms.get_deid_day_shifts(deid_parameters: Dict[str, str], query_json: Dict[str, Any]) → str

Run the de-ID request for day shifts and return the response as a json.

Parameters:

deid_parameters –

dictionary containing the de-ID URL and token in the following format:

{
    "deid_URL": "https://dummy.url.example",
    "deid_token": "1234567890"
}

query_json – the PACSIFIER query formatted as a dictionary

Returns:

JSON object containing the day shifts for each patient

pacsifier.cli.get_pseudonyms.get_deid_pseudonyms(deid_parameters: Dict[str, str], query_json: Dict[str, Any]) → str

Run the de-ID request and return the response as a json.

Parameters:

deid_parameters –

dictionary containing the de-ID URL and token in the following format:

{
    "deid_URL": "https://dummy.url.example",
    "deid_token": "1234567890"
}

query_json – the PACSIFIER query formatted as a dictionary

Returns:

JSON object containing the new pseudonyms for each patient

pacsifier.cli.get_pseudonyms.get_parser() → ArgumentParser: Get parser for command line arguments.

pacsifier.cli.get_pseudonyms.main(): Main function of the script.

pacsifier.cli.get_pseudonyms.split_deid_query_json_in_batch(deid_query_json: Dict[str, Any], batch_size: int = 500) → List[Dict[str, Any]]

Split the patients provided in the parameter into several batches of at most batch_size patients each.

Parameters:

deid_query_json – dictionary loaded from the deid json file
batch_size – The size of one batch

Returns:

List of dictionaries, each containing a batch of patients with the project information
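
The batching idea can be sketched as slicing the patient list and copying the remaining project-level keys into every batch. The layout of the de-ID JSON is not described in this documentation, so the key name `"patients"` below is a placeholder, and the function name `split_in_batches` is hypothetical.

```python
from typing import Any, Dict, List


def split_in_batches(
    query: Dict[str, Any], batch_size: int = 500, patients_key: str = "patients"
) -> List[Dict[str, Any]]:
    """Split query[patients_key] into chunks of at most batch_size items.

    "patients" is a placeholder key (an assumption); every other key of
    the query is copied unchanged into each batch.
    """
    patients = query[patients_key]
    shared = {k: v for k, v in query.items() if k != patients_key}
    return [
        {**shared, patients_key: patients[start:start + batch_size]}
        for start in range(0, len(patients), batch_size)
    ]


batches = split_in_batches({"project": "PACSIFIERCohort", "patients": [1, 2, 3, 4, 5]}, batch_size=2)
print(len(batches))  # 3
```

The last batch may be shorter than batch_size, as in this example where it holds a single patient.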

`pacsifier.cli.move_dumps`

Script to move all csv files retrieved by pacsifier --info ... into a new folder.

pacsifier.cli.move_dumps.get_parser() → ArgumentParser: Get parser for command line arguments.

pacsifier.cli.move_dumps.main(): Main function of the script that calls move().

pacsifier.cli.move_dumps.move(dicom_path: str, output_path: str) → None

Move all csv info files within a dicom directory into a new directory.

Parameters:

dicom_path – path to the folder containing dicoms.
output_path – path where the csv files within the dicom path will be moved.
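
Moving every csv file out of a dicom directory can be sketched with `pathlib` and `shutil`. This is not the script's actual code: the name `move_csv_files` is hypothetical, and preserving the relative directory structure in the destination is an assumption made here to avoid file name clashes.

```python
import shutil
from pathlib import Path
from typing import List


def move_csv_files(dicom_path: str, output_path: str) -> List[str]:
    """Move every *.csv under dicom_path into output_path, keeping relative paths.

    Assumption: the relative directory structure is preserved; the real
    script may organise the output differently.
    """
    src_root, dst_root = Path(dicom_path), Path(output_path)
    moved = []
    # sorted() materialises the matches first, so moving files does not disturb the walk.
    for csv_file in sorted(src_root.rglob("*.csv")):
        target = dst_root / csv_file.relative_to(src_root)
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.move(str(csv_file), str(target))
        moved.append(target.relative_to(dst_root).as_posix())
    return moved
```

The returned list of relative paths gives the caller a simple record of what was moved.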

`pacsifier.cli.add_karnak_tags`

Add private DICOM tags to several studies so that Karnak can de-identify them using provided patient codes and route them to the appropriate Kheops album.

pacsifier.cli.add_karnak_tags.get_parser() → ArgumentParser: Get parser object for command line arguments of the script.

pacsifier.cli.add_karnak_tags.main(): Main function of the script that calls tag_all_dicoms_within_root_folder().

pacsifier.cli.add_karnak_tags.tag_all_dicoms_within_root_folder(data_path: str, new_ids: Dict[str, str], day_shift: Dict[str, str], album_name: str) → None

Tag all dicom images located at the datapath for Karnak, adding an album name and patientCode private tags.

Parameters:

data_path – path to the dicom images
new_ids – real:code mapping to be used after de-identifying the original ids
day_shift – day shift per patient
album_name – name of the Kheops album

pacsifier.cli.add_karnak_tags.tag_dicom_file(filename: str, patient_code: str, patient_shift: str, album_name: str) → None

Tag the dicom image located at filename by adding patient code and Kheops album name to private tags for subsequent de-identification.

Parameters:

filename – path to dicom image
patient_code – pseudonymous patient code
patient_shift – day shift for the patient
album_name – Kheops album name

`pacsifier.cli.extract_carestream_report`

Script to extract plain text from Carestream radiology reports in SR.

pacsifier.cli.extract_carestream_report.extract_txt_report(data_folder: str) → None

This function loops over a BIDS-like (Brain Imaging Data Structure) dataset.

If some SRc files are found, it converts them to txt files and saves them in the same directory.

Note

The function assumes that each subject is stored as ~/.../sub-XXXXXX/ses-YYYYYYYYYYYYY/00001-CarestreamPACSReports/SRc.x.x.x.

Parameters: data_folder (str) – path to BIDS-like dataset

pacsifier.cli.extract_carestream_report.get_parser() → ArgumentParser: Get parser object for command line arguments of the script.

Note

It is assumed that each subject is stored as ~/.../sub-XXXXXX/ses-YYYYYYYYYYYYY/00001-CarestreamPACSReports.

pacsifier.cli.extract_carestream_report.main(): Main function of the script that calls extract_txt_report().

pacsifier.cli.extract_carestream_report.replace_special_char_combinations(input_report, print_clean_report=False) → str
