Stream: general

Topic: opening a ieeei4 file


view this post on Zulip Mira Berdahl (Nov 14 2023 at 22:24):

Hi, I'm looking to open a topography file with the name topography_km62_201209.ieeei4
I've never opened this file type, and not much is coming up online that is helpful. Does anyone know a quick way to read these files with python? Thanks in advance!

view this post on Zulip Katelyn FitzGerald (Nov 14 2023 at 22:43):

Mira Berdahl said:

Hi, I'm looking to open a topography file with the name topography_km62_201209.ieeei4
I've never opened this file type, and not much is coming up online that is helpful. Does anyone know a quick way to read these files with python? Thanks in advance!

Looks like a binary file extension specific to the NCAR climate modeling world.

I don't have an example (some others around here might), but you could try using something like numpy.fromfile.

view this post on Zulip Michael Levy (Nov 14 2023 at 22:54):

Yeah, I was just testing out the arguments but I think you want

import numpy as np
data = np.fromfile('topography_km62_201209.ieeei4', dtype='>i')

the i in dtype = '>i' tells numpy to read in 4-byte (32-bit) integers, and the > tells it to use big-endian (the file name looks like a CESM input file, which are all big-endian by convention). I would then look at np.min(data) and np.max(data) to make sure the range looks reasonable -- if they look off, try dtype='i' instead. I think little-endian is more common for binary files outside of the CESM world. You should also check np.shape(data) to make sure it is the size you were expecting.

Note that this will give you a 1D array of data, so you may need to reshape it (data2 = np.reshape(data, (...)), where (...) is an array of the new dim sizes).

view this post on Zulip Mira Berdahl (Nov 15 2023 at 18:38):

This is great @Michael Levy & @Katelyn FitzGerald thanks for the fast responses!
I just tried and can open the file, but its 1D as you suggested. Unfortunately, I don't know what dimensions the data ought to be though, so can't reshape it. Any idea how to glean this info?

view this post on Zulip Michael Levy (Nov 15 2023 at 19:08):

Mira Berdahl said:

I just tried and can open the file, but its 1D as you suggested. Unfortunately, I don't know what dimensions the data ought to be though, so can't reshape it. Any idea how to glean this info?

Can you give a little background on what this file is and how it is used? Also, how big is the 1D array?

view this post on Zulip Mira Berdahl (Nov 16 2023 at 17:12):

@Michael Levy , the file is a bathymetry file used in a 0.1deg ocean CESM simulation. The shape of the binary file is (8640000,). Thanks!

view this post on Zulip Michael Levy (Nov 16 2023 at 17:30):

The 0.1 degree grid is 3600 x 2400 -- Iwould try np.reshape(data, (3600, 2400)) and then plot it -- the region where the value is 0 should look like the continents, but if it doesn't then you'll want np.reshape(data, (2400, 3600))

view this post on Zulip Mira Berdahl (Nov 16 2023 at 19:39):

@Michael Levy This works, thanks. It needed to be of the shape (2400,3600). I am a bit confused, though, since I assumed this file would actually show the bathymetry (in meters), but instead it looks a lot like the KMT field, where the values are 0-62, corresponding to depth levels of the POP grid. Do you have any thoughts on this? Thanks.

view this post on Zulip Michael Levy (Nov 16 2023 at 20:15):

@Mira Berdahl the data is all integer and it corresponds to the number of active ocean levels at that location. For the 1 degree grid, there is a 1D array of cell-center depths associated with each layer and then you can use the index contained in the bathymetry file to find the depth at that column. For the 0.1 degree ocean simulation, we typically run with "partial bottom cells" turned on and that requires a separate 2D file containing the actual depth. If you have access to the case root for the CESM run itself, look in ${CASE_ROOT}/CaseDocs/pop_in and see what bottom_cell_file is set to. For the runs I've done with that grid, the default value is ${DIN_ROOT_LOC}/ocn/pop/tx0.1v3/grid/dzbc_pbc_s2.0_20171019.ieeer8, and you can read it in with data = np.reshape(np.fromfile('/path/to/file.ieeer8', '>d'), (2400, 3600)) (> for big-endian, and d instead of i since this is double-precision real values instead of integer values). Again, double check the min and max to make sure it read in correctly, and plot it to make sure the reshape was ordered right (but it should list columns in the same order as the bathymetry file). I think you're right about the units, but the max value will also make it obvious if the units are centimeters instead of meters

view this post on Zulip Michael Levy (Nov 18 2023 at 00:16):

@Mira Berdahl to fill in anyone who stumbles on this thread later, it sounds like what you really want to do is read the bathymetry used in a POP run. If you have access to any of the POP history (or time series) output files, there is a field HT that will give you ocean depth at a given cell. So bathymetry = xr.open_dataset(pop_netcdf_output)['HT'] should do the trick.

If you want to construct the bathymetry ahead of time, we were on the right track. The first file you read from the CESM input data repository (the .ieeei4 file) tell us how many active ocean layers each grid cell has. The second file you read in (the .ieeer8 file) is also necessary, but I didn't describe it correctly -- the dzbc_pbc file actually tells you the layer thickness (in centimeters) of the bottom most active layer. So the last thing we need is to know the expected layer thicknesses: then, for each cell, we replace the bottom-most thickness (index based on the value in the bathymetry file) with the value from the dzbc_pbc file and then sum over the thicknesses of the active cells to get the total depth.

The thicknesses of each layer (not considering the partial bottom cells) come from a file in the POP repository. Rather than processing it yourself, you can trust me when I tell you

dz = [1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1019.6808, 1056.4484, 1105.9951, 1167.807, 1242.4133, 1330.9678, 1435.141, 1557.1259, 1699.6796, 1866.2124, 2060.9023, 2288.8521, 2556.2471, 2870.575, 3240.8372, 3677.7725, 4194.0308, 4804.2236, 5524.7544, 6373.1919, 7366.9448, 8520.8926, 9843.6582, 11332.4658, 12967.1992, 14705.3438, 16480.709, 18209.1348, 19802.2344, 21185.957, 22316.5098, 23186.4941, 23819.4492, 24257.2168, 24546.7793, 24731.0137, 24844.3281, 24911.9746, 24951.291, 24973.5938, 24985.9609, 24992.6738, 24996.2441, 24998.1094, 25000.0, 25000.0]

So I think something like

num_levels = np.reshape(np.fromfile('topography_km62_201209.ieeei4', dtype='>i'), (2400, 3600))
dzbc = np.reshape(np.fromfile('/path/to/file.ieeer8', '>d'), (2400, 3600))
dz = [1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1019.6808, 1056.4484, 1105.9951, 1167.807, 1242.4133, 1330.9678, 1435.141, 1557.1259, 1699.6796, 1866.2124, 2060.9023, 2288.8521, 2556.2471, 2870.575, 3240.8372, 3677.7725, 4194.0308, 4804.2236, 5524.7544, 6373.1919, 7366.9448, 8520.8926, 9843.6582, 11332.4658, 12967.1992, 14705.3438, 16480.709, 18209.1348, 19802.2344, 21185.957, 22316.5098, 23186.4941, 23819.4492, 24257.2168, 24546.7793, 24731.0137, 24844.3281, 24911.9746, 24951.291, 24973.5938, 24985.9609, 24992.6738, 24996.2441, 24998.1094, 25000.0, 25000.0]

bathymetry = np.zeros((2400, 3600))
for k in range(62):
  bathymetry = bathymetry + np.where(k<num_levels-1, dz(k), np.where(k == num_levels-1, dzbc, 0))

Note that my network connection is a little flaky so I haven't had a chance to test the above, but the general gist should be right (and it should match HT from a history file). And both of those values will be in cm, not m.

view this post on Zulip Michael Levy (Nov 18 2023 at 01:04):

Despite @Keith Lindsay reminding me to be careful, I'm pretty sure the above code snippet has an off-by-one error because Fortran (and topography_km62_201209.ieeei4 start counting at 1 but the python array indices start counting at 0... now that my network is more stable, I'll whip up a quick jupyter notebook that runs on Casper and makes sure everything is correct

view this post on Zulip Michael Levy (Nov 18 2023 at 01:33):

Okay, it looks like my code above is correct, except I have a dz(k) that should really be dz[k] :) Here's a nicer formatting:

inputdata = os.path.join(os.sep,
                         os.environ['CESMDATAROOT'],
                         'inputdata',
                         'ocn',
                         'pop',
                         'tx0.1v3',
                         'grid',
                        )

# Set dz
dz = [1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0,
      1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0, 1000.0,
      1019.6808, 1056.4484, 1105.9951, 1167.807, 1242.4133, 1330.9678,
      1435.141, 1557.1259, 1699.6796, 1866.2124, 2060.9023, 2288.8521,
      2556.2471, 2870.575, 3240.8372, 3677.7725, 4194.0308, 4804.2236,
      5524.7544, 6373.1919, 7366.9448, 8520.8926, 9843.6582, 11332.4658,
      12967.1992, 14705.3438, 16480.709, 18209.1348, 19802.2344, 21185.957,
      22316.5098, 23186.4941, 23819.4492, 24257.2168, 24546.7793, 24731.0137,
      24844.3281, 24911.9746, 24951.291, 24973.5938, 24985.9609, 24992.6738,
      24996.2441, 24998.1094, 25000.0, 25000.0]

# Read topography file and dzbc
num_levels = np.reshape(np.fromfile(os.path.join(inputdata, 'topography_20170718.ieeei4'), dtype='>i'), (2400, 3600))
dzbc = np.reshape(np.fromfile(os.path.join(inputdata, 'dzbc_pbc_s2.0_20171019.ieeer8'), '>d'), (2400, 3600))

# Compute bathymetry
bathymetry = np.zeros((2400, 3600))
for k in range(62):
    bathymetry = bathymetry + np.where(k<num_levels-1, dz[k], np.where(k == num_levels-1, dzbc, 0))

I also read in HT from an earlier run and verified that HT - bathymetry is zero everywhere

view this post on Zulip Mira Berdahl (Nov 21 2023 at 01:38):

This is so helpful, thanks so much @Michael Levy !


Last updated: May 16 2025 at 17:14 UTC