opening a ieeei4 file · general · Zulip Chat Archive

Hi, I'm looking to open a topography file with the name topography_km62_201209.ieeei4
I've never opened this file type, and not much is coming up online that is helpful. Does anyone know a quick way to read these files with python? Thanks in advance!

Katelyn FitzGerald (Nov 14 2023 at 22:43):

I don't have an example (some others around here might), but you could try using something like numpy.fromfile.

Michael Levy (Nov 14 2023 at 22:54):

import numpy as np
data = np.fromfile('topography_km62_201209.ieeei4', dtype='>i')

the i in dtype = '>i' tells numpy to read in 4-byte (32-bit) integers, and the > tells it to use big-endian (the file name looks like a CESM input file, which are all big-endian by convention). I would then look at np.min(data) and np.max(data) to make sure the range looks reasonable -- if they look off, try dtype='i' instead. I think little-endian is more common for binary files outside of the CESM world. You should also check np.shape(data) to make sure it is the size you were expecting.

Note that this will give you a 1D array of data, so you may need to reshape it (data2 = np.reshape(data, (...)), where (...) is an array of the new dim sizes).

Mira Berdahl (Nov 15 2023 at 18:38):

This is great @Michael Levy & @Katelyn FitzGerald thanks for the fast responses!
I just tried and can open the file, but its 1D as you suggested. Unfortunately, I don't know what dimensions the data ought to be though, so can't reshape it. Any idea how to glean this info?

Michael Levy (Nov 15 2023 at 19:08):

Can you give a little background on what this file is and how it is used? Also, how big is the 1D array?

Mira Berdahl (Nov 16 2023 at 17:12):

@Michael Levy , the file is a bathymetry file used in a 0.1deg ocean CESM simulation. The shape of the binary file is (8640000,). Thanks!

Michael Levy (Nov 16 2023 at 17:30):

The 0.1 degree grid is 3600 x 2400 -- Iwould try np.reshape(data, (3600, 2400)) and then plot it -- the region where the value is 0 should look like the continents, but if it doesn't then you'll want np.reshape(data, (2400, 3600))

Mira Berdahl (Nov 16 2023 at 19:39):

@Michael Levy This works, thanks. It needed to be of the shape (2400,3600). I am a bit confused, though, since I assumed this file would actually show the bathymetry (in meters), but instead it looks a lot like the KMT field, where the values are 0-62, corresponding to depth levels of the POP grid. Do you have any thoughts on this? Thanks.

Michael Levy (Nov 16 2023 at 20:15):

@Mira Berdahl the data is all integer and it corresponds to the number of active ocean levels at that location. For the 1 degree grid, there is a 1D array of cell-center depths associated with each layer and then you can use the index contained in the bathymetry file to find the depth at that column. For the 0.1 degree ocean simulation, we typically run with "partial bottom cells" turned on and that requires a separate 2D file containing the actual depth. If you have access to the case root for the CESM run itself, look in ${CASE_ROOT}/CaseDocs/pop_in and see what bottom_cell_file is set to. For the runs I've done with that grid, the default value is ${DIN_ROOT_LOC}/ocn/pop/tx0.1v3/grid/dzbc_pbc_s2.0_20171019.ieeer8, and you can read it in with data = np.reshape(np.fromfile('/path/to/file.ieeer8', '>d'), (2400, 3600)) (> for big-endian, and d instead of i since this is double-precision real values instead of integer values). Again, double check the min and max to make sure it read in correctly, and plot it to make sure the reshape was ordered right (but it should list columns in the same order as the bathymetry file). I think you're right about the units, but the max value will also make it obvious if the units are centimeters instead of meters

Michael Levy (Nov 18 2023 at 00:16):

@Mira Berdahl to fill in anyone who stumbles on this thread later, it sounds like what you really want to do is read the bathymetry used in a POP run. If you have access to any of the POP history (or time series) output files, there is a field HT that will give you ocean depth at a given cell. So bathymetry = xr.open_dataset(pop_netcdf_output)['HT'] should do the trick.

If you want to construct the bathymetry ahead of time, we were on the right track. The first file you read from the CESM input data repository (the .ieeei4 file) tell us how many active ocean layers each grid cell has. The second file you read in (the .ieeer8 file) is also necessary, but I didn't describe it correctly -- the dzbc_pbc file actually tells you the layer thickness (in centimeters) of the bottom most active layer. So the last thing we need is to know the expected layer thicknesses: then, for each cell, we replace the bottom-most thickness (index based on the value in the bathymetry file) with the value from the dzbc_pbc file and then sum over the thicknesses of the active cells to get the total depth.

The thicknesses of each layer (not considering the partial bottom cells) come from a file in the POP repository. Rather than processing it yourself, you can trust me when I tell you

dz = [1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1019.6808, 1056.4484, 1105.9951, 1167.807, 1242.4133, 1330.9678, 1435.141, 1557.1259, 1699.6796, 1866.2124, 2060.9023, 2288.8521, 2556.2471, 2870.575, 3240.8372, 3677.7725, 4194.0308, 4804.2236, 5524.7544, 6373.1919, 7366.9448, 8520.8926, 9843.6582, 11332.4658, 12967.1992, 14705.3438, 16480.709, 18209.1348, 19802.2344, 21185.957, 22316.5098, 23186.4941, 23819.4492, 24257.2168, 24546.7793, 24731.0137, 24844.3281, 24911.9746, 24951.291, 24973.5938, 24985.9609, 24992.6738, 24996.2441, 24998.1094, 25000.0, 25000.0]

num_levels = np.reshape(np.fromfile('topography_km62_201209.ieeei4', dtype='>i'), (2400, 3600))
dzbc = np.reshape(np.fromfile('/path/to/file.ieeer8', '>d'), (2400, 3600))
dz = [1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1019.6808, 1056.4484, 1105.9951, 1167.807, 1242.4133, 1330.9678, 1435.141, 1557.1259, 1699.6796, 1866.2124, 2060.9023, 2288.8521, 2556.2471, 2870.575, 3240.8372, 3677.7725, 4194.0308, 4804.2236, 5524.7544, 6373.1919, 7366.9448, 8520.8926, 9843.6582, 11332.4658, 12967.1992, 14705.3438, 16480.709, 18209.1348, 19802.2344, 21185.957, 22316.5098, 23186.4941, 23819.4492, 24257.2168, 24546.7793, 24731.0137, 24844.3281, 24911.9746, 24951.291, 24973.5938, 24985.9609, 24992.6738, 24996.2441, 24998.1094, 25000.0, 25000.0]

bathymetry = np.zeros((2400, 3600))
for k in range(62):
  bathymetry = bathymetry + np.where(k<num_levels-1, dz(k), np.where(k == num_levels-1, dzbc, 0))

Note that my network connection is a little flaky so I haven't had a chance to test the above, but the general gist should be right (and it should match HT from a history file). And both of those values will be in cm, not m.

Michael Levy (Nov 18 2023 at 01:04):

Despite @Keith Lindsay reminding me to be careful, I'm pretty sure the above code snippet has an off-by-one error because Fortran (and topography_km62_201209.ieeei4 start counting at 1 but the python array indices start counting at 0... now that my network is more stable, I'll whip up a quick jupyter notebook that runs on Casper and makes sure everything is correct

Michael Levy (Nov 18 2023 at 01:33):

Okay, it looks like my code above is correct, except I have a dz(k) that should really be dz[k] :) Here's a nicer formatting:

inputdata = os.path.join(os.sep,
                         os.environ['CESMDATAROOT'],
                         'inputdata',
                         'ocn',
                         'pop',
                         'tx0.1v3',
                         'grid',
                        )

# Set dz
dz = [1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0,
      1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0,
      1019.6808, 1056.4484, 1105.9951, 1167.807, 1242.4133, 1330.9678,
      1435.141, 1557.1259, 1699.6796, 1866.2124, 2060.9023, 2288.8521,
      2556.2471, 2870.575, 3240.8372, 3677.7725, 4194.0308, 4804.2236,
      5524.7544, 6373.1919, 7366.9448, 8520.8926, 9843.6582, 11332.4658,
      12967.1992, 14705.3438, 16480.709, 18209.1348, 19802.2344, 21185.957,
      22316.5098, 23186.4941, 23819.4492, 24257.2168, 24546.7793, 24731.0137,
      24844.3281, 24911.9746, 24951.291, 24973.5938, 24985.9609, 24992.6738,
      24996.2441, 24998.1094, 25000.0, 25000.0]

# Read topography file and dzbc
num_levels = np.reshape(np.fromfile(os.path.join(inputdata, 'topography_20170718.ieeei4'), dtype='>i'), (2400, 3600))
dzbc = np.reshape(np.fromfile(os.path.join(inputdata, 'dzbc_pbc_s2.0_20171019.ieeer8'), '>d'), (2400, 3600))

# Compute bathymetry
bathymetry = np.zeros((2400, 3600))
for k in range(62):
    bathymetry = bathymetry + np.where(k<num_levels-1, dz[k], np.where(k == num_levels-1, dzbc, 0))

I also read in HT from an earlier run and verified that HT - bathymetry is zero everywhere

Stream: general

Topic: opening a ieeei4 file

Mira Berdahl (Nov 14 2023 at 22:24):