LKML Archive on lore.kernel.org
* [PATCH v6 0/6] x86: Show in sysfs if a memory node is able to do encryption
@ 2022-02-03 16:43 Martin Fernandez
  2022-02-03 16:43 ` [PATCH v6 1/6] mm/memblock: Tag memblocks with crypto capabilities Martin Fernandez
                   ` (5 more replies)
  0 siblings, 6 replies; 37+ messages in thread
From: Martin Fernandez @ 2022-02-03 16:43 UTC (permalink / raw)
  To: linux-kernel, linux-efi, platform-driver-x86, linux-mm
  Cc: tglx, mingo, bp, dave.hansen, x86, hpa, ardb, dvhart, andy,
	gregkh, rafael, rppt, akpm, daniel.gutson, hughsient,
	alex.bazhaniuk, alison.schofield, keescook, Martin Fernandez

Show for each node if every memory descriptor in that node has the
EFI_MEMORY_CPU_CRYPTO attribute.

The fwupd project plans to use this as part of a check that users
have properly configured hardware memory encryption. The fwupd
developers have seen cases where memory encryption appears to be
enabled because all the hardware is capable of it, but on closer
inspection it is not, either because of system firmware or because
some component requires updating to enable the feature.

The plan is to make this check part of a specification that can be
passed to people purchasing hardware. The specification is called Host
Security ID: https://fwupd.github.io/libfwupdplugin/hsi.html.

These checks will run at every boot.

We chose to do it on a per-node basis because, although an ABI that
shows whether the whole system memory is capable of encryption would
be useful for the fwupd use case, doing it per node also lets the user
target allocations from applications to NUMA nodes which have
encryption capabilities.


Changes since v5:

Refactor e820__range_{update, remove, set_crypto_capable} in order to
avoid code duplication.

Warn the user when a node has both encryptable and non-encryptable
regions.

Check that e820_table has enough room to store both the current
e820_table and the EFI memmap.


Changes since v4:

Add enum to represent the cryptographic capabilities in e820:
e820_crypto_capabilities.

Revert __e820__range_update, only adding the new argument for
__e820__range_add about crypto capabilities.

Add a function __e820__range_update_crypto similar to
__e820__range_update but to only update this new field.


Changes since v3:

Update date in Doc/ABI file.

Add more information about the fwupd use case and the rationale behind
doing it per NUMA node.


Changes since v2:

e820__range_mark_crypto -> e820__range_mark_crypto_capable.

In e820__range_remove: Create a region with crypto capabilities
instead of creating one without it and then mark it.


Changes since v1:

Modify __e820__range_update to update the crypto capabilities of a
range; now this function will change the crypto capability of a range
if it's called with the same old_type and new_type. Rework
efi_mark_e820_regions_as_crypto_capable based on this.

Update do_add_efi_memmap to mark the regions as it creates them.

Change the type of crypto_capable in e820_entry from bool to u8.

Fix e820__update_table changes.

Remove memblock_add_crypto_capable. Now you have to add the region and
then mark it.

Better place for crypto_capable in pglist_data.


Martin Fernandez (6):
  mm/memblock: Tag memblocks with crypto capabilities
  mm/mmzone: Tag pg_data_t with crypto capabilities
  x86/e820: Refactor range_update and range_remove
  x86/e820: Tag e820_entry with crypto capabilities
  x86/efi: Tag e820_entries as crypto capable from EFI memmap
  drivers/node: Show in sysfs node's crypto capabilities

 Documentation/ABI/testing/sysfs-devices-node |  10 +
 arch/x86/include/asm/e820/api.h              |   1 +
 arch/x86/include/asm/e820/types.h            |  12 +-
 arch/x86/kernel/e820.c                       | 485 +++++++++++++++----
 arch/x86/platform/efi/efi.c                  |  37 ++
 drivers/base/node.c                          |  10 +
 include/linux/memblock.h                     |  15 +-
 include/linux/mmzone.h                       |   3 +
 mm/memblock.c                                |  64 +++
 mm/page_alloc.c                              |   1 +
 10 files changed, 531 insertions(+), 107 deletions(-)
 create mode 100644 Documentation/ABI/testing/sysfs-devices-node

-- 
2.30.2



* [PATCH v6 1/6] mm/memblock: Tag memblocks with crypto capabilities
  2022-02-03 16:43 [PATCH v6 0/6] x86: Show in sysfs if a memory node is able to do encryption Martin Fernandez
@ 2022-02-03 16:43 ` Martin Fernandez
  2022-02-03 18:07   ` Mike Rapoport
  2022-02-07 21:18   ` Kees Cook
  2022-02-03 16:43 ` [PATCH v6 2/6] mm/mmzone: Tag pg_data_t " Martin Fernandez
                   ` (4 subsequent siblings)
  5 siblings, 2 replies; 37+ messages in thread
From: Martin Fernandez @ 2022-02-03 16:43 UTC (permalink / raw)
  To: linux-kernel, linux-efi, platform-driver-x86, linux-mm
  Cc: tglx, mingo, bp, dave.hansen, x86, hpa, ardb, dvhart, andy,
	gregkh, rafael, rppt, akpm, daniel.gutson, hughsient,
	alex.bazhaniuk, alison.schofield, keescook, Martin Fernandez

Add the capability to mark regions of the memory memblock_type as
capable of hardware memory encryption.

Also add the capability to query whether all regions of a memory node
are capable of hardware memory encryption; this is called when
initializing the nodes. Warn the user if a node has both encryptable
and non-encryptable regions.
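
As an illustration only (not part of the patch), the per-node rule
described above can be sketched in plain userspace C. The names
model_region, MODEL_CRYPTO_CAPABLE and model_node_is_crypto_capable are
hypothetical stand-ins for the kernel's memblock types, and the "mixed"
output models the condition under which the patch warns:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins for the kernel types; not the real API. */
#define MODEL_CRYPTO_CAPABLE 0x10u

struct model_region {
	int nid;            /* NUMA node the region belongs to */
	unsigned int flags; /* bitmask, may include MODEL_CRYPTO_CAPABLE */
};

/*
 * Return true only if no region on node @nid lacks the flag.
 * If @mixed is non-NULL, it is set to true when the node has both
 * capable and non-capable regions, and to false otherwise.
 */
static bool model_node_is_crypto_capable(const struct model_region *regions,
					 size_t count, int nid, bool *mixed)
{
	bool capable = false;
	bool not_capable = false;
	size_t i;

	assert(regions != NULL || count == 0);

	for (i = 0; i < count; i++) {
		if (regions[i].nid != nid)
			continue;
		if (regions[i].flags & MODEL_CRYPTO_CAPABLE)
			capable = true;
		else
			not_capable = true;
	}

	if (mixed)
		*mixed = capable && not_capable;

	return !not_capable;
}
```

Note that, under this rule, a node with no regions at all is reported
as capable, since nothing on it contradicts the claim.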

Signed-off-by: Martin Fernandez <martin.fernandez@eclypsium.com>
---
 include/linux/memblock.h | 15 ++++++----
 mm/memblock.c            | 64 ++++++++++++++++++++++++++++++++++++++++
 2 files changed, 74 insertions(+), 5 deletions(-)

diff --git a/include/linux/memblock.h b/include/linux/memblock.h
index 9dc7cb239d21..73edcce165a5 100644
--- a/include/linux/memblock.h
+++ b/include/linux/memblock.h
@@ -41,13 +41,15 @@ extern unsigned long long max_possible_pfn;
  * via a driver, and never indicated in the firmware-provided memory map as
  * system RAM. This corresponds to IORESOURCE_SYSRAM_DRIVER_MANAGED in the
  * kernel resource tree.
+ * @MEMBLOCK_CRYPTO_CAPABLE: capable of hardware encryption
  */
 enum memblock_flags {
-	MEMBLOCK_NONE		= 0x0,	/* No special request */
-	MEMBLOCK_HOTPLUG	= 0x1,	/* hotpluggable region */
-	MEMBLOCK_MIRROR		= 0x2,	/* mirrored region */
-	MEMBLOCK_NOMAP		= 0x4,	/* don't add to kernel direct mapping */
-	MEMBLOCK_DRIVER_MANAGED = 0x8,	/* always detected via a driver */
+	MEMBLOCK_NONE		= 0x0,		/* No special request */
+	MEMBLOCK_HOTPLUG	= 0x1,		/* hotpluggable region */
+	MEMBLOCK_MIRROR		= 0x2,		/* mirrored region */
+	MEMBLOCK_NOMAP		= 0x4,		/* don't add to kernel direct mapping */
+	MEMBLOCK_DRIVER_MANAGED = 0x8,		/* always detected via a driver */
+	MEMBLOCK_CRYPTO_CAPABLE = 0x10,		/* capable of hardware encryption */
 };
 
 /**
@@ -121,6 +123,9 @@ int memblock_physmem_add(phys_addr_t base, phys_addr_t size);
 void memblock_trim_memory(phys_addr_t align);
 bool memblock_overlaps_region(struct memblock_type *type,
 			      phys_addr_t base, phys_addr_t size);
+bool memblock_node_is_crypto_capable(int nid);
+int memblock_mark_crypto_capable(phys_addr_t base, phys_addr_t size);
+int memblock_clear_crypto_capable(phys_addr_t base, phys_addr_t size);
 int memblock_mark_hotplug(phys_addr_t base, phys_addr_t size);
 int memblock_clear_hotplug(phys_addr_t base, phys_addr_t size);
 int memblock_mark_mirror(phys_addr_t base, phys_addr_t size);
diff --git a/mm/memblock.c b/mm/memblock.c
index 1018e50566f3..fcf79befeab3 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -191,6 +191,42 @@ bool __init_memblock memblock_overlaps_region(struct memblock_type *type,
 	return i < type->cnt;
 }
 
+/**
+ * memblock_node_is_crypto_capable - get if whole node is capable
+ * of encryption
+ * @nid: number of node
+ *
+ * Iterate over all memory memblock_type and find if all regions under
+ * node @nid are capable of hardware encryption.
+ *
+ * Return:
+ * true if every region in memory memblock_type is capable of
+ * encryption, false otherwise.
+ */
+bool __init_memblock memblock_node_is_crypto_capable(int nid)
+{
+	struct memblock_region *region;
+	bool crypto_capable = false;
+	bool not_crypto_capable = false;
+
+	for_each_mem_region(region) {
+		if (memblock_get_region_node(region) == nid) {
+			crypto_capable =
+				crypto_capable ||
+				(region->flags & MEMBLOCK_CRYPTO_CAPABLE);
+			not_crypto_capable =
+				not_crypto_capable ||
+				!(region->flags & MEMBLOCK_CRYPTO_CAPABLE);
+		}
+	}
+
+	if (crypto_capable && not_crypto_capable)
+		pr_warn_once("Node %d has regions that are encryptable and regions that aren't\n",
+			     nid);
+
+	return !not_crypto_capable;
+}
+
 /**
  * __memblock_find_range_bottom_up - find free area utility in bottom-up
  * @start: start of candidate range
@@ -885,6 +921,34 @@ static int __init_memblock memblock_setclr_flag(phys_addr_t base,
 	return 0;
 }
 
+/**
+ * memblock_mark_crypto_capable - Mark memory regions capable of hardware
+ * encryption with flag MEMBLOCK_CRYPTO_CAPABLE.
+ * @base: the base phys addr of the region
+ * @size: the size of the region
+ *
+ * Return: 0 on success, -errno on failure.
+ */
+int __init_memblock memblock_mark_crypto_capable(phys_addr_t base,
+						 phys_addr_t size)
+{
+	return memblock_setclr_flag(base, size, 1, MEMBLOCK_CRYPTO_CAPABLE);
+}
+
+/**
+ * memblock_clear_crypto_capable - Clear flag MEMBLOCK_CRYPTO_CAPABLE for a
+ * specified region.
+ * @base: the base phys addr of the region
+ * @size: the size of the region
+ *
+ * Return: 0 on success, -errno on failure.
+ */
+int __init_memblock memblock_clear_crypto_capable(phys_addr_t base,
+						  phys_addr_t size)
+{
+	return memblock_setclr_flag(base, size, 0, MEMBLOCK_CRYPTO_CAPABLE);
+}
+
 /**
  * memblock_mark_hotplug - Mark hotpluggable memory with flag MEMBLOCK_HOTPLUG.
  * @base: the base phys addr of the region
-- 
2.30.2



* [PATCH v6 2/6] mm/mmzone: Tag pg_data_t with crypto capabilities
  2022-02-03 16:43 [PATCH v6 0/6] x86: Show in sysfs if a memory node is able to do encryption Martin Fernandez
  2022-02-03 16:43 ` [PATCH v6 1/6] mm/memblock: Tag memblocks with crypto capabilities Martin Fernandez
@ 2022-02-03 16:43 ` Martin Fernandez
  2022-02-07 21:19   ` Kees Cook
  2022-02-03 16:43 ` [PATCH v6 3/6] x86/e820: Refactor range_update and range_remove Martin Fernandez
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 37+ messages in thread
From: Martin Fernandez @ 2022-02-03 16:43 UTC (permalink / raw)
  To: linux-kernel, linux-efi, platform-driver-x86, linux-mm
  Cc: tglx, mingo, bp, dave.hansen, x86, hpa, ardb, dvhart, andy,
	gregkh, rafael, rppt, akpm, daniel.gutson, hughsient,
	alex.bazhaniuk, alison.schofield, keescook, Martin Fernandez

Add a new member in the pg_data_t struct to tell whether the node
corresponding to that pg_data_t is able to do hardware memory encryption.

This will be read from sysfs.

Signed-off-by: Martin Fernandez <martin.fernandez@eclypsium.com>
---
 include/linux/mmzone.h | 3 +++
 mm/page_alloc.c        | 1 +
 2 files changed, 4 insertions(+)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 936dc0b6c226..cec51e7a01d9 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -871,6 +871,9 @@ typedef struct pglist_data {
 	struct task_struct *kcompactd;
 	bool proactive_compact_trigger;
 #endif
+
+	bool crypto_capable;
+
 	/*
 	 * This is a per-node reserve of pages that are not available
 	 * to userspace allocations.
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index c5952749ad40..8bcbd6fa0089 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -7592,6 +7592,7 @@ static void __init free_area_init_node(int nid)
 	pgdat->node_id = nid;
 	pgdat->node_start_pfn = start_pfn;
 	pgdat->per_cpu_nodestats = NULL;
+	pgdat->crypto_capable = memblock_node_is_crypto_capable(nid);
 
 	pr_info("Initmem setup node %d [mem %#018Lx-%#018Lx]\n", nid,
 		(u64)start_pfn << PAGE_SHIFT,
-- 
2.30.2



* [PATCH v6 3/6] x86/e820: Refactor range_update and range_remove
  2022-02-03 16:43 [PATCH v6 0/6] x86: Show in sysfs if a memory node is able to do encryption Martin Fernandez
  2022-02-03 16:43 ` [PATCH v6 1/6] mm/memblock: Tag memblocks with crypto capabilities Martin Fernandez
  2022-02-03 16:43 ` [PATCH v6 2/6] mm/mmzone: Tag pg_data_t " Martin Fernandez
@ 2022-02-03 16:43 ` Martin Fernandez
  2022-02-07 21:45   ` Kees Cook
  2022-02-08 21:04   ` Daniel Gutson
  2022-02-03 16:43 ` [PATCH v6 4/6] x86/e820: Tag e820_entry with crypto capabilities Martin Fernandez
                   ` (2 subsequent siblings)
  5 siblings, 2 replies; 37+ messages in thread
From: Martin Fernandez @ 2022-02-03 16:43 UTC (permalink / raw)
  To: linux-kernel, linux-efi, platform-driver-x86, linux-mm
  Cc: tglx, mingo, bp, dave.hansen, x86, hpa, ardb, dvhart, andy,
	gregkh, rafael, rppt, akpm, daniel.gutson, hughsient,
	alex.bazhaniuk, alison.schofield, keescook, Martin Fernandez

__e820__range_update and e820__range_remove had very similar
implementations that differed in only a few lines: the ones that
actually modify the e820_table. The similarities were in the checks
for how each entry intersects with the given range (if it does at
all). These checks are very precise and error-prone, so it was not a
good idea to have them in both places.

I propose a refactor of those functions, given that I need to create a
similar one for this patchset.

Add a function to modify an E820 table in a given range. The
modification is driven by two helper structs: e820_entry_updater and
e820_*_data.

The first one, e820_entry_updater, carries 3 callbacks which function
as the actions to take on the table.

The other one, e820_*_data carries information needed by the
callbacks, for example in the case of range_update it will carry the
type that we are targeting.

Signed-off-by: Martin Fernandez <martin.fernandez@eclypsium.com>
---
 arch/x86/kernel/e820.c | 383 ++++++++++++++++++++++++++++++-----------
 1 file changed, 283 insertions(+), 100 deletions(-)

diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c
index bc0657f0deed..89b78c6b345b 100644
--- a/arch/x86/kernel/e820.c
+++ b/arch/x86/kernel/e820.c
@@ -459,144 +459,327 @@ static int __init append_e820_table(struct boot_e820_entry *entries, u32 nr_entr
 	return __append_e820_table(entries, nr_entries);
 }
 
+/**
+ * e820_entry_updater - Helper type for __e820__handle_range_update().
+ * @should_update: Return true if @entry needs to be updated, false
+ * otherwise.
+ * @update: Apply desired actions to an @entry that is inside the
+ * range and satisfies @should_update.
+ * @new: Create new entry in the table with information gathered from
+ * @original and @data.
+ *
+ * Each function corresponds to an action that
+ * __e820__handle_range_update() does. Callbacks need to cast @data back
+ * to the corresponding type.
+ */
+struct e820_entry_updater {
+	bool (*should_update)(const struct e820_entry *entry, const void *data);
+	void (*update)(struct e820_entry *entry, const void *data);
+	void (*new)(struct e820_table *table, u64 new_start, u64 new_size,
+		    const struct e820_entry *original, const void *data);
+};
+
+/**
+ * e820_remover_data - Helper type for e820__range_remove().
+ * @old_type: old_type parameter of e820__range_remove().
+ * @check_type: check_type parameter of e820__range_remove().
+ *
+ * This is intended to be used as the @data argument for the
+ * e820_entry_updater callbacks.
+ */
+struct e820_remover_data {
+	enum e820_type old_type;
+	bool check_type;
+};
+
+/**
+ * e820_type_updater_data - Helper type for __e820__range_update().
+ * @old_type: old_type parameter of __e820__range_update().
+ * @new_type: new_type parameter of __e820__range_update().
+ *
+ * This is intended to be used as the @data argument for the
+ * e820_entry_updater callbacks.
+ */
+struct e820_type_updater_data {
+	enum e820_type old_type;
+	enum e820_type new_type;
+};
+
+/**
+ * __e820__handle_intersected_range_update() - Helper function for
+ * __e820__handle_range_update().
+ * @table: Target e820_table.
+ * @start: Start of the range.
+ * @size: Size of the range.
+ * @entry: Current entry that __e820__handle_range_update() was
+ * looking into.
+ * @updater: updater parameter of __e820__handle_range_update().
+ * @data: data parameter of __e820__handle_range_update().
+ *
+ * Helper for __e820__handle_range_update to handle the case where
+ * neither the entry completely covers the range nor the range
+ * completely covers the entry.
+ *
+ * Return: The updated size.
+ */
 static u64 __init
-__e820__range_update(struct e820_table *table, u64 start, u64 size, enum e820_type old_type, enum e820_type new_type)
+__e820__handle_intersected_range_update(struct e820_table *table,
+					u64 start,
+					u64 size,
+					struct e820_entry *entry,
+					const struct e820_entry_updater *updater,
+					const void *data)
 {
 	u64 end;
-	unsigned int i;
-	u64 real_updated_size = 0;
-
-	BUG_ON(old_type == new_type);
+	u64 entry_end = entry->addr + entry->size;
+	u64 inner_start;
+	u64 inner_end;
+	u64 updated_size = 0;
 
 	if (size > (ULLONG_MAX - start))
 		size = ULLONG_MAX - start;
 
 	end = start + size;
-	printk(KERN_DEBUG "e820: update [mem %#010Lx-%#010Lx] ", start, end - 1);
-	e820_print_type(old_type);
-	pr_cont(" ==> ");
-	e820_print_type(new_type);
-	pr_cont("\n");
-
-	for (i = 0; i < table->nr_entries; i++) {
-		struct e820_entry *entry = &table->entries[i];
-		u64 final_start, final_end;
-		u64 entry_end;
+	inner_start = max(start, entry->addr);
+	inner_end = min(end, entry_end);
+
+	/* Range and entry do intersect and... */
+	if (inner_start < inner_end) {
+		/* Entry is on the left */
+		if (entry->addr < inner_start) {
+			/* Resize current entry */
+			entry->size = inner_start - entry->addr;
+		/* Entry is on the right */
+		} else {
+			/* Resize and move current section */
+			entry->addr = inner_end;
+			entry->size = entry_end - inner_end;
+		}
+		/* Create new entry with intersected region */
+		updater->new(table, inner_start, inner_end - inner_start, entry, data);
 
-		if (entry->type != old_type)
-			continue;
+		updated_size += inner_end - inner_start;
+	} /* Else: [start, end) doesn't cover entry */
 
-		entry_end = entry->addr + entry->size;
+	return updated_size;
+}
 
-		/* Completely covered by new range? */
-		if (entry->addr >= start && entry_end <= end) {
-			entry->type = new_type;
-			real_updated_size += entry->size;
-			continue;
-		}
+/** __e820__handle_range_update(): Helper function to update an address
+ * range in an e820_table
+ * @table: e820_table that we want to modify.
+ * @start: Start of the range.
+ * @size: Size of the range.
+ * @updater: Callbacks to modify the table.
+ * @data: Information to modify the table.
+ *
+ * Update the table @table in [@start, @start + @size) doing the
+ * actions given in @updater.
+ *
+ * Return: The updated size.
+ */
+static u64 __init
+__e820__handle_range_update(struct e820_table *table,
+			    u64 start,
+			    u64 size,
+			    const struct e820_entry_updater *updater,
+			    const void *data)
+{
+	u64 updated_size = 0;
+	u64 end;
+	unsigned int i;
 
-		/* New range is completely covered? */
-		if (entry->addr < start && entry_end > end) {
-			__e820__range_add(table, start, size, new_type);
-			__e820__range_add(table, end, entry_end - end, entry->type);
-			entry->size = start - entry->addr;
-			real_updated_size += size;
-			continue;
-		}
+	if (size > (ULLONG_MAX - start))
+		size = ULLONG_MAX - start;
 
-		/* Partially covered: */
-		final_start = max(start, entry->addr);
-		final_end = min(end, entry_end);
-		if (final_start >= final_end)
-			continue;
+	end = start + size;
 
-		__e820__range_add(table, final_start, final_end - final_start, new_type);
+	for (i = 0; i < table->nr_entries; i++) {
+		struct e820_entry *entry = &table->entries[i];
+		u64 entry_end = entry->addr + entry->size;
+
+		if (updater->should_update(entry, data)) {
+			/* Range completely covers entry */
+			if (entry->addr >= start && entry_end <= end) {
+				updater->update(entry, data);
+				updated_size += entry->size;
+			/* Entry completely covers range */
+			} else if (start > entry->addr && end < entry_end) {
+				/* Resize current entry */
+				entry->size = start - entry->addr;
+
+				/* Create new entry with intersection region */
+				updater->new(table, start, size, entry, data);
+
+				/*
+				 * Create a new entry for the leftover
+				 * of the current entry
+				 */
+				__e820__range_add(table, end, entry_end - end,
+						  entry->type);
+
+				updated_size += size;
+			} else {
+				updated_size +=
+					__e820__handle_intersected_range_update(table, start, size,
+										entry, updater, data);
+			}
+		}
+	}
 
-		real_updated_size += final_end - final_start;
+	return updated_size;
+}
 
-		/*
-		 * Left range could be head or tail, so need to update
-		 * its size first:
-		 */
-		entry->size -= final_end - final_start;
-		if (entry->addr < final_start)
-			continue;
+static bool __init type_updater__should_update(const struct e820_entry *entry,
+					       const void *data)
+{
+	struct e820_type_updater_data *type_updater_data =
+		(struct e820_type_updater_data *)data;
 
-		entry->addr = final_end;
-	}
-	return real_updated_size;
+	return entry->type == type_updater_data->old_type;
 }
 
-u64 __init e820__range_update(u64 start, u64 size, enum e820_type old_type, enum e820_type new_type)
+static void __init type_updater__update(struct e820_entry *entry,
+					const void *data)
 {
-	return __e820__range_update(e820_table, start, size, old_type, new_type);
+	struct e820_type_updater_data *type_updater_data =
+		(struct e820_type_updater_data *)data;
+
+	entry->type = type_updater_data->new_type;
 }
 
-static u64 __init e820__range_update_kexec(u64 start, u64 size, enum e820_type old_type, enum e820_type  new_type)
+static void __init type_updater__new(struct e820_table *table, u64 new_start,
+				     u64 new_size,
+				     const struct e820_entry *original,
+				     const void *data)
 {
-	return __e820__range_update(e820_table_kexec, start, size, old_type, new_type);
+	struct e820_type_updater_data *type_updater_data =
+		(struct e820_type_updater_data *)data;
+
+	__e820__range_add(table, new_start, new_size,
+			  type_updater_data->new_type);
 }
 
-/* Remove a range of memory from the E820 table: */
-u64 __init e820__range_remove(u64 start, u64 size, enum e820_type old_type, bool check_type)
+static u64 __init __e820__range_update(struct e820_table *table, u64 start,
+				       u64 size, enum e820_type old_type,
+				       enum e820_type new_type)
 {
-	int i;
-	u64 end;
-	u64 real_removed_size = 0;
+	struct e820_entry_updater updater = {
+		.should_update = type_updater__should_update,
+		.update = type_updater__update,
+		.new = type_updater__new
+	};
 
-	if (size > (ULLONG_MAX - start))
-		size = ULLONG_MAX - start;
+	struct e820_type_updater_data data = {
+		.old_type = old_type,
+		.new_type = new_type
+	};
 
-	end = start + size;
-	printk(KERN_DEBUG "e820: remove [mem %#010Lx-%#010Lx] ", start, end - 1);
-	if (check_type)
-		e820_print_type(old_type);
+	BUG_ON(old_type == new_type);
+
+	printk(KERN_DEBUG "e820: update [mem %#018Lx-%#018Lx] ", start,
+	       start + size - 1);
+	e820_print_type(old_type);
+	pr_cont(" ==> ");
+	e820_print_type(new_type);
 	pr_cont("\n");
 
-	for (i = 0; i < e820_table->nr_entries; i++) {
-		struct e820_entry *entry = &e820_table->entries[i];
-		u64 final_start, final_end;
-		u64 entry_end;
+	return __e820__handle_range_update(table, start, size, &updater, &data);
+}
 
-		if (check_type && entry->type != old_type)
-			continue;
+static bool __init remover__should_update(const struct e820_entry *entry,
+					  const void *data)
+{
+	struct e820_remover_data *remover_data =
+		(struct e820_remover_data *)data;
 
-		entry_end = entry->addr + entry->size;
+	return !remover_data->check_type ||
+	       entry->type == remover_data->old_type;
+}
 
-		/* Completely covered? */
-		if (entry->addr >= start && entry_end <= end) {
-			real_removed_size += entry->size;
-			memset(entry, 0, sizeof(*entry));
-			continue;
-		}
+static void __init remover__update(struct e820_entry *entry, const void *data)
+{
+	memset(entry, 0, sizeof(*entry));
+}
 
-		/* Is the new range completely covered? */
-		if (entry->addr < start && entry_end > end) {
-			e820__range_add(end, entry_end - end, entry->type);
-			entry->size = start - entry->addr;
-			real_removed_size += size;
-			continue;
-		}
+static void __init remover__new(struct e820_table *table, u64 new_start,
+				u64 new_size, const struct e820_entry *original,
+				const void *data)
+{
+}
 
-		/* Partially covered: */
-		final_start = max(start, entry->addr);
-		final_end = min(end, entry_end);
-		if (final_start >= final_end)
-			continue;
+/**
+ * e820__range_remove() - Remove an address range from e820_table.
+ * @start: Start of the address range.
+ * @size: Size of the address range.
+ * @old_type: Type of the entries that we want to remove.
+ * @check_type: Bool to decide if ignore @old_type or not.
+ *
+ * Remove [@start, @start + @size) from e820_table. If @check_type is
+ * true remove only entries with type @old_type.
+ *
+ * Return: The size removed.
+ */
+u64 __init e820__range_remove(u64 start, u64 size, enum e820_type old_type,
+			      bool check_type)
+{
+	struct e820_entry_updater updater = {
+		.should_update = remover__should_update,
+		.update = remover__update,
+		.new = remover__new
+	};
+
+	struct e820_remover_data data = {
+		.check_type = check_type,
+		.old_type = old_type
+	};
+
+	printk(KERN_DEBUG "e820: remove [mem %#018Lx-%#018Lx] ", start,
+	       start + size - 1);
+	if (check_type)
+		e820_print_type(old_type);
+	pr_cont("\n");
 
-		real_removed_size += final_end - final_start;
+	return __e820__handle_range_update(e820_table, start, size, &updater,
+					    &data);
+}
 
-		/*
-		 * Left range could be head or tail, so need to update
-		 * the size first:
-		 */
-		entry->size -= final_end - final_start;
-		if (entry->addr < final_start)
-			continue;
+/**
+ * e820__range_update() - Update the type of a given address range in
+ * e820_table.
+ * @start: Start of the range.
+ * @size: Size of the range.
+ * @old_type: Type that we want to change.
+ * @new_type: New type to replace @old_type.
+ *
+ * Update type of addresses in [@start, @start + @size) from @old_type
+ * to @new_type in e820_table.
+ *
+ * Return: The size updated.
+ */
+u64 __init e820__range_update(u64 start, u64 size, enum e820_type old_type,
+			      enum e820_type new_type)
+{
+	return __e820__range_update(e820_table, start, size, old_type, new_type);
+}
 
-		entry->addr = final_end;
-	}
-	return real_removed_size;
+/**
+ * e820__range_update_kexec() - Update the type of a given address
+ * range in e820_table_kexec.
+ * @start: Start of the range.
+ * @size: Size of the range.
+ * @old_type: Type that we want to change.
+ * @new_type: New type to replace @old_type.
+ *
+ * Update type of addresses in [@start, @start + @size) from @old_type
+ * to @new_type in e820_table_kexec.
+ *
+ * Return: The size updated.
+ */
+static u64 __init e820__range_update_kexec(u64 start, u64 size,
+					   enum e820_type old_type,
+					   enum e820_type new_type)
+{
+	return __e820__range_update(e820_table_kexec, start, size, old_type, new_type);
 }
 
 void __init e820__update_table_print(void)
-- 
2.30.2



* [PATCH v6 4/6] x86/e820: Tag e820_entry with crypto capabilities
  2022-02-03 16:43 [PATCH v6 0/6] x86: Show in sysfs if a memory node is able to do encryption Martin Fernandez
                   ` (2 preceding siblings ...)
  2022-02-03 16:43 ` [PATCH v6 3/6] x86/e820: Refactor range_update and range_remove Martin Fernandez
@ 2022-02-03 16:43 ` Martin Fernandez
  2022-02-07 21:56   ` Kees Cook
  2022-02-03 16:43 ` [PATCH v6 5/6] x86/efi: Tag e820_entries as crypto capable from EFI memmap Martin Fernandez
  2022-02-03 16:43 ` [PATCH v6 6/6] drivers/node: Show in sysfs node's crypto capabilities Martin Fernandez
  5 siblings, 1 reply; 37+ messages in thread
From: Martin Fernandez @ 2022-02-03 16:43 UTC (permalink / raw)
  To: linux-kernel, linux-efi, platform-driver-x86, linux-mm
  Cc: tglx, mingo, bp, dave.hansen, x86, hpa, ardb, dvhart, andy,
	gregkh, rafael, rppt, akpm, daniel.gutson, hughsient,
	alex.bazhaniuk, alison.schofield, keescook, Martin Fernandez

Add a new enum for crypto capabilities.

Add a new member in e820_entry to hold whether an entry is able to do
hardware memory encryption or not.

Add a new function e820__range_set_crypto_capable to mark all the
entries in a range of addresses as encryptable. This will be called
when initializing EFI.

Change e820__update_table to handle merging and overlap problems
taking into account crypto_capable.
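
As an illustration only (not part of the patch), the overlap rule that
e820__update_table applies can be sketched in plain userspace C: where
several entries overlap, the resulting span is treated as
crypto-capable only if every overlapping entry is. The names
sketch_crypto, sketch_overlap_crypto and sketch_can_merge are
hypothetical, and the merge check deliberately ignores the kernel's
e820_nomerge() condition:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical mirror of the two-valued capability enum. */
enum sketch_crypto {
	SKETCH_NOT_CAPABLE = 0,
	SKETCH_CAPABLE = 1,
};

/*
 * Capability of a span covered by @count overlapping entries:
 * the minimum of their capabilities, so one non-capable entry
 * makes the whole span non-capable.
 */
static enum sketch_crypto sketch_overlap_crypto(const enum sketch_crypto *caps,
						size_t count)
{
	enum sketch_crypto current = SKETCH_CAPABLE;
	size_t i;

	assert(caps != NULL || count == 0);

	for (i = 0; i < count; i++) {
		if (caps[i] < current)
			current = caps[i];
	}

	return current;
}

/*
 * Simplified merge test: neighbouring spans may be merged only when
 * both the memory type and the capability match.
 */
static bool sketch_can_merge(int type, int last_type,
			     enum sketch_crypto crypto,
			     enum sketch_crypto last_crypto)
{
	return type == last_type && crypto == last_crypto;
}
```

The consequence is that two adjacent usable-RAM spans that differ only
in capability stay as separate entries instead of being merged.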

Signed-off-by: Martin Fernandez <martin.fernandez@eclypsium.com>
---
 arch/x86/include/asm/e820/api.h   |   1 +
 arch/x86/include/asm/e820/types.h |  12 +++-
 arch/x86/kernel/e820.c            | 114 ++++++++++++++++++++++++++++--
 3 files changed, 119 insertions(+), 8 deletions(-)

diff --git a/arch/x86/include/asm/e820/api.h b/arch/x86/include/asm/e820/api.h
index e8f58ddd06d9..4b3b01fafdd1 100644
--- a/arch/x86/include/asm/e820/api.h
+++ b/arch/x86/include/asm/e820/api.h
@@ -17,6 +17,7 @@ extern bool e820__mapped_all(u64 start, u64 end, enum e820_type type);
 extern void e820__range_add   (u64 start, u64 size, enum e820_type type);
 extern u64  e820__range_update(u64 start, u64 size, enum e820_type old_type, enum e820_type new_type);
 extern u64  e820__range_remove(u64 start, u64 size, enum e820_type old_type, bool check_type);
+extern u64  e820__range_set_crypto_capable(u64 start, u64 size);
 
 extern void e820__print_table(char *who);
 extern int  e820__update_table(struct e820_table *table);
diff --git a/arch/x86/include/asm/e820/types.h b/arch/x86/include/asm/e820/types.h
index 314f75d886d0..aef03c665f5e 100644
--- a/arch/x86/include/asm/e820/types.h
+++ b/arch/x86/include/asm/e820/types.h
@@ -46,6 +46,11 @@ enum e820_type {
 	E820_TYPE_RESERVED_KERN	= 128,
 };
 
+enum e820_crypto_capabilities {
+	E820_NOT_CRYPTO_CAPABLE	= 0,
+	E820_CRYPTO_CAPABLE	= 1,
+};
+
 /*
  * A single E820 map entry, describing a memory range of [addr...addr+size-1],
  * of 'type' memory type:
@@ -53,9 +58,10 @@ enum e820_type {
  * (We pack it because there can be thousands of them on large systems.)
  */
 struct e820_entry {
-	u64			addr;
-	u64			size;
-	enum e820_type		type;
+	u64				addr;
+	u64				size;
+	enum e820_type			type;
+	enum e820_crypto_capabilities	crypto_capable;
 } __attribute__((packed));
 
 /*
diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c
index 89b78c6b345b..098882d02120 100644
--- a/arch/x86/kernel/e820.c
+++ b/arch/x86/kernel/e820.c
@@ -163,7 +163,9 @@ int e820__get_entry_type(u64 start, u64 end)
 /*
  * Add a memory region to the kernel E820 map.
  */
-static void __init __e820__range_add(struct e820_table *table, u64 start, u64 size, enum e820_type type)
+static void __init __e820__range_add(struct e820_table *table, u64 start,
+				     u64 size, enum e820_type type,
+				     enum e820_crypto_capabilities crypto_capable)
 {
 	int x = table->nr_entries;
 
@@ -176,12 +178,13 @@ static void __init __e820__range_add(struct e820_table *table, u64 start, u64 si
 	table->entries[x].addr = start;
 	table->entries[x].size = size;
 	table->entries[x].type = type;
+	table->entries[x].crypto_capable = crypto_capable;
 	table->nr_entries++;
 }
 
 void __init e820__range_add(u64 start, u64 size, enum e820_type type)
 {
-	__e820__range_add(e820_table, start, size, type);
+	__e820__range_add(e820_table, start, size, type, E820_NOT_CRYPTO_CAPABLE);
 }
 
 static void __init e820_print_type(enum e820_type type)
@@ -211,6 +214,8 @@ void __init e820__print_table(char *who)
 			e820_table->entries[i].addr + e820_table->entries[i].size - 1);
 
 		e820_print_type(e820_table->entries[i].type);
+		if (e820_table->entries[i].crypto_capable == E820_CRYPTO_CAPABLE)
+			pr_cont("; crypto-capable");
 		pr_cont("\n");
 	}
 }
@@ -327,6 +332,7 @@ int __init e820__update_table(struct e820_table *table)
 	unsigned long long last_addr;
 	u32 new_nr_entries, overlap_entries;
 	u32 i, chg_idx, chg_nr;
+	enum e820_crypto_capabilities current_crypto, last_crypto;
 
 	/* If there's only one memory region, don't bother: */
 	if (table->nr_entries < 2)
@@ -367,6 +373,7 @@ int __init e820__update_table(struct e820_table *table)
 	new_nr_entries = 0;	 /* Index for creating new map entries */
 	last_type = 0;		 /* Start with undefined memory type */
 	last_addr = 0;		 /* Start with 0 as last starting address */
+	last_crypto = E820_NOT_CRYPTO_CAPABLE;
 
 	/* Loop through change-points, determining effect on the new map: */
 	for (chg_idx = 0; chg_idx < chg_nr; chg_idx++) {
@@ -388,13 +395,19 @@ int __init e820__update_table(struct e820_table *table)
 		 * 1=usable, 2,3,4,4+=unusable)
 		 */
 		current_type = 0;
+		current_crypto = E820_CRYPTO_CAPABLE;
 		for (i = 0; i < overlap_entries; i++) {
+			if (overlap_list[i]->crypto_capable < current_crypto)
+				current_crypto = overlap_list[i]->crypto_capable;
+
 			if (overlap_list[i]->type > current_type)
 				current_type = overlap_list[i]->type;
 		}
 
 		/* Continue building up new map based on this information: */
-		if (current_type != last_type || e820_nomerge(current_type)) {
+		if (current_type != last_type ||
+		    current_crypto != last_crypto ||
+		    e820_nomerge(current_type)) {
 			if (last_type != 0)	 {
 				new_entries[new_nr_entries].size = change_point[chg_idx]->addr - last_addr;
 				/* Move forward only if the new size was non-zero: */
@@ -406,9 +419,12 @@ int __init e820__update_table(struct e820_table *table)
 			if (current_type != 0)	{
 				new_entries[new_nr_entries].addr = change_point[chg_idx]->addr;
 				new_entries[new_nr_entries].type = current_type;
+				new_entries[new_nr_entries].crypto_capable = current_crypto;
+
 				last_addr = change_point[chg_idx]->addr;
 			}
 			last_type = current_type;
+			last_crypto = current_crypto;
 		}
 	}
 
@@ -505,6 +521,19 @@ struct e820_type_updater_data {
 	enum e820_type new_type;
 };
 
+/**
+ * e820_crypto_updater_data - Helper type for
+ * __e820__range_update_crypto().
+ * @crypto_capable: crypto_capable parameter of
+ * __e820__range_update_crypto().
+ *
+ * This is intended to be used as the @data argument for the
+ * e820_entry_updater callbacks.
+ */
+struct e820_crypto_updater_data {
+	enum e820_crypto_capabilities crypto_capable;
+};
+
 /**
  * __e820__handle_intersected_range_update() - Helper function for
  * __e820__handle_range_update().
@@ -615,7 +644,8 @@ __e820__handle_range_update(struct e820_table *table,
 				 * of the current entry
 				 */
 				__e820__range_add(table, end, entry_end - end,
-						  entry->type);
+						  entry->type,
+						  entry->crypto_capable);
 
 				updated_size += size;
 			} else {
@@ -656,7 +686,7 @@ static void __init type_updater__new(struct e820_table *table, u64 new_start,
 		(struct e820_type_updater_data *)data;
 
 	__e820__range_add(table, new_start, new_size,
-			  type_updater_data->new_type);
+			  type_updater_data->new_type, original->crypto_capable);
 }
 
 static u64 __init __e820__range_update(struct e820_table *table, u64 start,
@@ -686,6 +716,62 @@ static u64 __init __e820__range_update(struct e820_table *table, u64 start,
 	return __e820__handle_range_update(table, start, size, &updater, &data);
 }
 
+static bool __init crypto_updater__should_update(const struct e820_entry *entry,
+						 const void *data)
+{
+	struct e820_crypto_updater_data *crypto_updater_data =
+		(struct e820_crypto_updater_data *)data;
+
+	return crypto_updater_data->crypto_capable != entry->crypto_capable;
+}
+
+static void __init crypto_updater__update(struct e820_entry *entry,
+					  const void *data)
+{
+	struct e820_crypto_updater_data *crypto_updater_data =
+		(struct e820_crypto_updater_data *)data;
+
+	entry->crypto_capable = crypto_updater_data->crypto_capable;
+}
+
+static void __init crypto_updater__new(struct e820_table *table, u64 new_start,
+				       u64 new_size,
+				       const struct e820_entry *original,
+				       const void *data)
+{
+	struct e820_crypto_updater_data *crypto_updater_data =
+		(struct e820_crypto_updater_data *)data;
+
+	__e820__range_add(table, new_start, new_size, original->type,
+			  crypto_updater_data->crypto_capable);
+}
+
+static u64 __init
+__e820__range_update_crypto(struct e820_table *table, u64 start, u64 size,
+			    enum e820_crypto_capabilities crypto_capable)
+{
+	struct e820_entry_updater updater = {
+		.should_update = crypto_updater__should_update,
+		.update = crypto_updater__update,
+		.new = crypto_updater__new
+	};
+
+	struct e820_crypto_updater_data data = {
+		.crypto_capable = crypto_capable,
+	};
+
+	printk(KERN_DEBUG "e820: crypto update [mem %#018Lx-%#018Lx]", start,
+	       start + size - 1);
+	pr_cont(" ==> ");
+	if (crypto_capable == E820_CRYPTO_CAPABLE)
+		pr_cont("crypto capable");
+	else
+		pr_cont("not crypto capable");
+	pr_cont("\n");
+
+	return __e820__handle_range_update(table, start, size, &updater, &data);
+}
+
 static bool __init remover__should_update(const struct e820_entry *entry,
 					  const void *data)
 {
@@ -782,6 +868,22 @@ static u64 __init e820__range_update_kexec(u64 start, u64 size,
 	return __e820__range_update(e820_table_kexec, start, size, old_type, new_type);
 }
 
+/**
+ * e820__range_set_crypto_capable() - Set %E820_CRYPTO_CAPABLE to a
+ * given range of addresses in e820_table.
+ * @start: Start of the range.
+ * @size: Size of the range.
+ *
+ * Set %E820_CRYPTO_CAPABLE to [@start, @start + @size) in e820_table.
+ *
+ * Return: The size updated.
+ */
+u64 __init e820__range_set_crypto_capable(u64 start, u64 size)
+{
+	return __e820__range_update_crypto(e820_table, start, size,
+					   E820_CRYPTO_CAPABLE);
+}
+
 void __init e820__update_table_print(void)
 {
 	if (e820__update_table(e820_table))
@@ -1505,6 +1607,8 @@ void __init e820__memblock_setup(void)
 			continue;
 
 		memblock_add(entry->addr, entry->size);
+		if (entry->crypto_capable == E820_CRYPTO_CAPABLE)
+			memblock_mark_crypto_capable(entry->addr, entry->size);
 	}
 
 	/* Throw away partial pages: */
-- 
2.30.2


^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH v6 5/6] x86/efi: Tag e820_entries as crypto capable from EFI memmap
  2022-02-03 16:43 [PATCH v6 0/6] x86: Show in sysfs if a memory node is able to do encryption Martin Fernandez
                   ` (3 preceding siblings ...)
  2022-02-03 16:43 ` [PATCH v6 4/6] x86/e820: Tag e820_entry with crypto capabilities Martin Fernandez
@ 2022-02-03 16:43 ` Martin Fernandez
  2022-02-03 16:43 ` [PATCH v6 6/6] drivers/node: Show in sysfs node's crypto capabilities Martin Fernandez
  5 siblings, 0 replies; 37+ messages in thread
From: Martin Fernandez @ 2022-02-03 16:43 UTC (permalink / raw)
  To: linux-kernel, linux-efi, platform-driver-x86, linux-mm
  Cc: tglx, mingo, bp, dave.hansen, x86, hpa, ardb, dvhart, andy,
	gregkh, rafael, rppt, akpm, daniel.gutson, hughsient,
	alex.bazhaniuk, alison.schofield, keescook, Martin Fernandez

Add a function to iterate over the EFI Memory Map and mark the regions
tagged with EFI_MEMORY_CPU_CRYPTO in the e820_table; and call it from
efi_init if add_efi_memmap is disabled.

Also modify do_add_efi_memmap to mark the regions there.

If add_efi_memmap is false, also check that the e820_table has enough
room to hold the EFI memmap entries as well.

Signed-off-by: Martin Fernandez <martin.fernandez@eclypsium.com>
---
 arch/x86/platform/efi/efi.c | 37 +++++++++++++++++++++++++++++++++++++
 1 file changed, 37 insertions(+)

diff --git a/arch/x86/platform/efi/efi.c b/arch/x86/platform/efi/efi.c
index 147c30a81f15..3efa1c620c75 100644
--- a/arch/x86/platform/efi/efi.c
+++ b/arch/x86/platform/efi/efi.c
@@ -184,6 +184,8 @@ static void __init do_add_efi_memmap(void)
 		}
 
 		e820__range_add(start, size, e820_type);
+		if (md->attribute & EFI_MEMORY_CPU_CRYPTO)
+			e820__range_set_crypto_capable(start, size);
 	}
 	e820__update_table(e820_table);
 }
@@ -441,6 +443,34 @@ static int __init efi_config_init(const efi_config_table_type_t *arch_tables)
 	return ret;
 }
 
+static void __init efi_mark_e820_regions_as_crypto_capable(void)
+{
+	efi_memory_desc_t *md;
+
+	/*
+	 * Calling e820__range_set_crypto_capable several times
+	 * creates a bunch of entries in the E820 table. They probably
+	 * will get merged when calling update_table but we need the
+	 * space there anyway
+	 */
+	if (efi.memmap.nr_map + e820_table->nr_entries >= E820_MAX_ENTRIES) {
+		pr_err_once("E820 table is not large enough to fit EFI memmap; not marking entries as crypto capable\n");
+		return;
+	}
+
+	for_each_efi_memory_desc(md) {
+		if (md->attribute & EFI_MEMORY_CPU_CRYPTO)
+			e820__range_set_crypto_capable(md->phys_addr,
+						       md->num_pages << EFI_PAGE_SHIFT);
+	}
+
+	/*
+	 * We added and modified regions so it's good to update the
+	 * table to merge/sort
+	 */
+	e820__update_table(e820_table);
+}
+
 void __init efi_init(void)
 {
 	if (IS_ENABLED(CONFIG_X86_32) &&
@@ -494,6 +524,13 @@ void __init efi_init(void)
 	set_bit(EFI_RUNTIME_SERVICES, &efi.flags);
 	efi_clean_memmap();
 
+	/*
+	 * If add_efi_memmap then there is no need to mark the regions
+	 * again
+	 */
+	if (!add_efi_memmap)
+		efi_mark_e820_regions_as_crypto_capable();
+
 	if (efi_enabled(EFI_DBG))
 		efi_print_memmap();
 }
-- 
2.30.2


^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH v6 6/6] drivers/node: Show in sysfs node's crypto capabilities
  2022-02-03 16:43 [PATCH v6 0/6] x86: Show in sysfs if a memory node is able to do encryption Martin Fernandez
                   ` (4 preceding siblings ...)
  2022-02-03 16:43 ` [PATCH v6 5/6] x86/efi: Tag e820_entries as crypto capable from EFI memmap Martin Fernandez
@ 2022-02-03 16:43 ` Martin Fernandez
  2022-02-04  3:47   ` Limonciello, Mario
  2022-02-04  4:56   ` Mike Rapoport
  5 siblings, 2 replies; 37+ messages in thread
From: Martin Fernandez @ 2022-02-03 16:43 UTC (permalink / raw)
  To: linux-kernel, linux-efi, platform-driver-x86, linux-mm
  Cc: tglx, mingo, bp, dave.hansen, x86, hpa, ardb, dvhart, andy,
	gregkh, rafael, rppt, akpm, daniel.gutson, hughsient,
	alex.bazhaniuk, alison.schofield, keescook, Martin Fernandez

Show in sysfs, for each node, whether its memory can be encrypted by
the CPU, i.e. whether all of its memory is marked with
EFI_MEMORY_CPU_CRYPTO in the EFI memory map.

Signed-off-by: Martin Fernandez <martin.fernandez@eclypsium.com>
---
 Documentation/ABI/testing/sysfs-devices-node | 10 ++++++++++
 drivers/base/node.c                          | 10 ++++++++++
 2 files changed, 20 insertions(+)
 create mode 100644 Documentation/ABI/testing/sysfs-devices-node

diff --git a/Documentation/ABI/testing/sysfs-devices-node b/Documentation/ABI/testing/sysfs-devices-node
new file mode 100644
index 000000000000..0d1fd86c9faf
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-devices-node
@@ -0,0 +1,10 @@
+What:		/sys/devices/system/node/nodeX/crypto_capable
+Date:		February 2022
+Contact:	Martin Fernandez <martin.fernandez@eclypsium.com>
+Users:		fwupd (https://fwupd.org)
+Description:
+		This value is 1 if all system memory in this node is
+		marked with EFI_MEMORY_CPU_CRYPTO, indicating that the
+		system memory is capable of being protected with the
+		CPU’s memory cryptographic capabilities. It is 0
+		otherwise.
\ No newline at end of file
diff --git a/drivers/base/node.c b/drivers/base/node.c
index 87acc47e8951..dabaed997ecd 100644
--- a/drivers/base/node.c
+++ b/drivers/base/node.c
@@ -560,11 +560,21 @@ static ssize_t node_read_distance(struct device *dev,
 }
 static DEVICE_ATTR(distance, 0444, node_read_distance, NULL);
 
+static ssize_t crypto_capable_show(struct device *dev,
+				   struct device_attribute *attr, char *buf)
+{
+	struct pglist_data *pgdat = NODE_DATA(dev->id);
+
+	return sysfs_emit(buf, "%d\n", pgdat->crypto_capable);
+}
+static DEVICE_ATTR_RO(crypto_capable);
+
 static struct attribute *node_dev_attrs[] = {
 	&dev_attr_meminfo.attr,
 	&dev_attr_numastat.attr,
 	&dev_attr_distance.attr,
 	&dev_attr_vmstat.attr,
+	&dev_attr_crypto_capable.attr,
 	NULL
 };
 
-- 
2.30.2


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v6 1/6] mm/memblock: Tag memblocks with crypto capabilities
  2022-02-03 16:43 ` [PATCH v6 1/6] mm/memblock: Tag memblocks with crypto capabilities Martin Fernandez
@ 2022-02-03 18:07   ` Mike Rapoport
  2022-02-03 18:24     ` Martin Fernandez
  2022-02-07 21:18   ` Kees Cook
  1 sibling, 1 reply; 37+ messages in thread
From: Mike Rapoport @ 2022-02-03 18:07 UTC (permalink / raw)
  To: Martin Fernandez
  Cc: linux-kernel, linux-efi, platform-driver-x86, linux-mm, tglx,
	mingo, bp, dave.hansen, x86, hpa, ardb, dvhart, andy, gregkh,
	rafael, akpm, daniel.gutson, hughsient, alex.bazhaniuk,
	alison.schofield, keescook

On Thu, Feb 03, 2022 at 01:43:23PM -0300, Martin Fernandez wrote:
> Add the capability to mark regions of the memory memory_type able of
> hardware memory encryption.
> 
> Also add the capability to query if all regions of a memory node are
> able to do hardware memory encryption to call it when initializing the
> nodes. Warn the user if a node has both encryptable and
> non-encryptable regions.
> 
> Signed-off-by: Martin Fernandez <martin.fernandez@eclypsium.com>
> ---
>  include/linux/memblock.h | 15 ++++++----
>  mm/memblock.c            | 64 ++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 74 insertions(+), 5 deletions(-)
> 
> diff --git a/include/linux/memblock.h b/include/linux/memblock.h
> index 9dc7cb239d21..73edcce165a5 100644
> --- a/include/linux/memblock.h
> +++ b/include/linux/memblock.h
> @@ -41,13 +41,15 @@ extern unsigned long long max_possible_pfn;
>   * via a driver, and never indicated in the firmware-provided memory map as
>   * system RAM. This corresponds to IORESOURCE_SYSRAM_DRIVER_MANAGED in the
>   * kernel resource tree.
> + * @MEMBLOCK_CRYPTO_CAPABLE: capable of hardware encryption
>   */
>  enum memblock_flags {
> -	MEMBLOCK_NONE		= 0x0,	/* No special request */
> -	MEMBLOCK_HOTPLUG	= 0x1,	/* hotpluggable region */
> -	MEMBLOCK_MIRROR		= 0x2,	/* mirrored region */
> -	MEMBLOCK_NOMAP		= 0x4,	/* don't add to kernel direct mapping */
> -	MEMBLOCK_DRIVER_MANAGED = 0x8,	/* always detected via a driver */
> +	MEMBLOCK_NONE		= 0x0,		/* No special request */
> +	MEMBLOCK_HOTPLUG	= 0x1,		/* hotpluggable region */
> +	MEMBLOCK_MIRROR		= 0x2,		/* mirrored region */
> +	MEMBLOCK_NOMAP		= 0x4,		/* don't add to kernel direct mapping */
> +	MEMBLOCK_DRIVER_MANAGED = 0x8,		/* always detected via a driver */
> +	MEMBLOCK_CRYPTO_CAPABLE = 0x10,		/* capable of hardware encryption */

Please keep the comment indentation.

>  };
>  
>  /**
> @@ -121,6 +123,9 @@ int memblock_physmem_add(phys_addr_t base, phys_addr_t size);
>  void memblock_trim_memory(phys_addr_t align);
>  bool memblock_overlaps_region(struct memblock_type *type,
>  			      phys_addr_t base, phys_addr_t size);
> +bool memblock_node_is_crypto_capable(int nid);
> +int memblock_mark_crypto_capable(phys_addr_t base, phys_addr_t size);
> +int memblock_clear_crypto_capable(phys_addr_t base, phys_addr_t size);
>  int memblock_mark_hotplug(phys_addr_t base, phys_addr_t size);
>  int memblock_clear_hotplug(phys_addr_t base, phys_addr_t size);
>  int memblock_mark_mirror(phys_addr_t base, phys_addr_t size);
> diff --git a/mm/memblock.c b/mm/memblock.c
> index 1018e50566f3..fcf79befeab3 100644
> --- a/mm/memblock.c
> +++ b/mm/memblock.c
> @@ -191,6 +191,42 @@ bool __init_memblock memblock_overlaps_region(struct memblock_type *type,
>  	return i < type->cnt;
>  }
>  
> +/**
> + * memblock_node_is_crypto_capable - get if whole node is capable
> + * of encryption
> + * @nid: number of node
> + *
> + * Iterate over all memory memblock_type and find if all regions under
> + * node @nid are capable of hardware encryption.
> + *
> + * Return:
> + * true if every region in memory memblock_type is capable of
> + * encryption, false otherwise.
> + */
> +bool __init_memblock memblock_node_is_crypto_capable(int nid)
> +{
> +	struct memblock_region *region;
> +	bool crypto_capable = false;
> +	bool not_crypto_capable = false;
> +
> +	for_each_mem_region(region) {
> +		if (memblock_get_region_node(region) == nid) {
> +			crypto_capable =
> +				crypto_capable ||
> +				(region->flags & MEMBLOCK_CRYPTO_CAPABLE);
> +			not_crypto_capable =
> +				not_crypto_capable ||
> +				!(region->flags & MEMBLOCK_CRYPTO_CAPABLE);

Isn't

 			if (region->flags & MEMBLOCK_CRYPTO_CAPABLE)
				crypto_capable++;
			else
				not_crypto_capable++;

simpler and clearer?

(of course s/bool/int in the declaration)

> +		}
> +	}
> +
> +	if (crypto_capable && not_crypto_capable)
> +		pr_warn_once("Node %d has regions that are encryptable and regions that aren't",
> +			     nid);

This will print only the first node with mixed regions. With a single
caller of memblock_node_is_crypto_capable() I think pr_warn() is ok.

> +
> +	return !not_crypto_capable;
> +}
> +
>  /**
>   * __memblock_find_range_bottom_up - find free area utility in bottom-up
>   * @start: start of candidate range
> @@ -885,6 +921,34 @@ static int __init_memblock memblock_setclr_flag(phys_addr_t base,
>  	return 0;
>  }
>  
> +/**
> + * memblock_mark_crypto_capable - Mark memory regions capable of hardware
> + * encryption with flag MEMBLOCK_CRYPTO_CAPABLE.
> + * @base: the base phys addr of the region
> + * @size: the size of the region
> + *
> + * Return: 0 on success, -errno on failure.
> + */
> +int __init_memblock memblock_mark_crypto_capable(phys_addr_t base,
> +						 phys_addr_t size)
> +{
> +	return memblock_setclr_flag(base, size, 1, MEMBLOCK_CRYPTO_CAPABLE);
> +}
> +
> +/**
> + * memblock_clear_crypto_capable - Clear flag MEMBLOCK_CRYPTO_CAPABLE
> + * for a specified region.
> + * @base: the base phys addr of the region
> + * @size: the size of the region
> + *
> + * Return: 0 on success, -errno on failure.
> + */
> +int __init_memblock memblock_clear_crypto_capable(phys_addr_t base,
> +						  phys_addr_t size)
> +{
> +	return memblock_setclr_flag(base, size, 0, MEMBLOCK_CRYPTO_CAPABLE);
> +}
> +
>  /**
>   * memblock_mark_hotplug - Mark hotpluggable memory with flag MEMBLOCK_HOTPLUG.
>   * @base: the base phys addr of the region
> -- 
> 2.30.2
> 

-- 
Sincerely yours,
Mike.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v6 1/6] mm/memblock: Tag memblocks with crypto capabilities
  2022-02-03 18:07   ` Mike Rapoport
@ 2022-02-03 18:24     ` Martin Fernandez
  0 siblings, 0 replies; 37+ messages in thread
From: Martin Fernandez @ 2022-02-03 18:24 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: linux-kernel, linux-efi, platform-driver-x86, linux-mm, tglx,
	mingo, bp, dave.hansen, x86, hpa, ardb, dvhart, andy, gregkh,
	rafael, akpm, daniel.gutson, hughsient, alex.bazhaniuk,
	alison.schofield, keescook

On 2/3/22, Mike Rapoport <rppt@kernel.org> wrote:
> On Thu, Feb 03, 2022 at 01:43:23PM -0300, Martin Fernandez wrote:
>> +/**
>> + * memblock_node_is_crypto_capable - get if whole node is capable
>> + * of encryption
>> + * @nid: number of node
>> + *
>> + * Iterate over all memory memblock_type and find if all regions under
>> + * node @nid are capable of hardware encryption.
>> + *
>> + * Return:
>> + * true if every region in memory memblock_type is capable of
>> + * encryption, false otherwise.
>> + */
>> +bool __init_memblock memblock_node_is_crypto_capable(int nid)
>> +{
>> +	struct memblock_region *region;
>> +	bool crypto_capable = false;
>> +	bool not_crypto_capable = false;
>> +
>> +	for_each_mem_region(region) {
>> +		if (memblock_get_region_node(region) == nid) {
>> +			crypto_capable =
>> +				crypto_capable ||
>> +				(region->flags & MEMBLOCK_CRYPTO_CAPABLE);
>> +			not_crypto_capable =
>> +				not_crypto_capable ||
>> +				!(region->flags & MEMBLOCK_CRYPTO_CAPABLE);
>
> Isn't
>
>  			if (region->flags & MEMBLOCK_CRYPTO_CAPABLE)
> 				crypto_capable++;
> 			else
> 				not_crypto_capable++;
>
> simpler and clearer?
>
> (of course s/bool/int in the declaration)
>

Yes! It is. I like that.

>> +		}
>> +	}
>> +
>> +	if (crypto_capable && not_crypto_capable)
>> +		pr_warn_once("Node %d has regions that are encryptable and regions that aren't",
>> +			     nid);
>
> This will print only the first node with mixed regions. With a single
> caller of memblock_node_is_crypto_capable() I think pr_warn() is ok.
>

Yes, you are correct, I don't really want _once here.
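
For illustration only, here is a minimal userspace sketch of how the check
could look with the two changes agreed above (counters instead of chained
booleans, and a warning on every mixed node rather than only the first).
The struct, the flag value and the function name are hypothetical
stand-ins, not the memblock definitions, and fprintf() stands in for
pr_warn():

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for MEMBLOCK_CRYPTO_CAPABLE; not the kernel header. */
#define MODEL_CRYPTO_CAPABLE 0x10u

/* Hypothetical stand-in for a memblock region tagged with its node. */
struct model_region {
	int nid;		/* NUMA node the region belongs to */
	unsigned int flags;	/* MODEL_CRYPTO_CAPABLE or 0 */
};

/*
 * Returns true when no region of node @nid lacks the crypto-capable flag.
 * As in the patch, a node without any region also yields true.
 * *mixed is set when the node has both kinds of region; that is the case
 * in which the kernel code would warn.
 */
static bool model_node_is_crypto_capable(const struct model_region *regions,
					 size_t n, int nid, bool *mixed)
{
	int crypto_capable = 0;
	int not_crypto_capable = 0;
	size_t i;

	assert(regions != NULL || n == 0);

	for (i = 0; i < n; i++) {
		if (regions[i].nid != nid)
			continue;
		if (regions[i].flags & MODEL_CRYPTO_CAPABLE)
			crypto_capable++;
		else
			not_crypto_capable++;
	}

	*mixed = crypto_capable && not_crypto_capable;
	if (*mixed)
		fprintf(stderr,
			"Node %d has regions that are encryptable and regions that aren't\n",
			nid);

	return not_crypto_capable == 0;
}
```

With regions {node 0, capable}, {node 0, not capable} and {node 1,
capable}, node 0 is reported as not capable and flagged as mixed, while
node 1 is reported as capable.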

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v6 6/6] drivers/node: Show in sysfs node's crypto capabilities
  2022-02-03 16:43 ` [PATCH v6 6/6] drivers/node: Show in sysfs node's crypto capabilities Martin Fernandez
@ 2022-02-04  3:47   ` Limonciello, Mario
  2022-02-04 13:21     ` Martin Fernandez
  2022-02-04  4:56   ` Mike Rapoport
  1 sibling, 1 reply; 37+ messages in thread
From: Limonciello, Mario @ 2022-02-04  3:47 UTC (permalink / raw)
  To: Martin Fernandez, linux-kernel, linux-efi, platform-driver-x86, linux-mm
  Cc: tglx, mingo, bp, dave.hansen, x86, hpa, ardb, dvhart, andy,
	gregkh, rafael, rppt, akpm, daniel.gutson, hughsient,
	alex.bazhaniuk, alison.schofield, keescook, Lendacky, Thomas

On 2/3/2022 10:43, Martin Fernandez wrote:
> Show in sysfs, for each node, whether its memory can be encrypted by
> the CPU, i.e. whether all of its memory is marked with
> EFI_MEMORY_CPU_CRYPTO in the EFI memory map.
> 
> Signed-off-by: Martin Fernandez <martin.fernandez@eclypsium.com>
> ---
>   Documentation/ABI/testing/sysfs-devices-node | 10 ++++++++++
>   drivers/base/node.c                          | 10 ++++++++++
>   2 files changed, 20 insertions(+)
>   create mode 100644 Documentation/ABI/testing/sysfs-devices-node
> 
> diff --git a/Documentation/ABI/testing/sysfs-devices-node b/Documentation/ABI/testing/sysfs-devices-node
> new file mode 100644
> index 000000000000..0d1fd86c9faf
> --- /dev/null
> +++ b/Documentation/ABI/testing/sysfs-devices-node
> @@ -0,0 +1,10 @@
> +What:		/sys/devices/system/node/nodeX/crypto_capable
> +Date:		February 2022
> +Contact:	Martin Fernandez <martin.fernandez@eclypsium.com>
> +Users:		fwupd (https://fwupd.org)
> +Description:
> +		This value is 1 if all system memory in this node is
> +		marked with EFI_MEMORY_CPU_CRYPTO, indicating that the
> +		system memory is capable of being protected with the
> +		CPU’s memory cryptographic capabilities. It is 0
> +		otherwise.
> \ No newline at end of file
> diff --git a/drivers/base/node.c b/drivers/base/node.c
> index 87acc47e8951..dabaed997ecd 100644
> --- a/drivers/base/node.c
> +++ b/drivers/base/node.c
> @@ -560,11 +560,21 @@ static ssize_t node_read_distance(struct device *dev,
>   }
>   static DEVICE_ATTR(distance, 0444, node_read_distance, NULL);
>   
> +static ssize_t crypto_capable_show(struct device *dev,
> +				   struct device_attribute *attr, char *buf)
> +{
> +	struct pglist_data *pgdat = NODE_DATA(dev->id);
> +
> +	return sysfs_emit(buf, "%d\n", pgdat->crypto_capable);

As there is interest in seeing these capabilities from userspace, it 
seems like a logical time to also expose a `crypto_active` attribute.

Then userspace can make a judgement call if the system supports crypto 
memory (`crypto_capable`) and then also whether or not it's been turned 
on (`crypto_active`).

`crypto_active` could be detected with some existing support in the 
kernel of `mem_encrypt_active()`.  This will then work for a variety of 
architectures too that offer `mem_encrypt_active()`.

As it stands today the only reliable way to tell from userspace (at 
least for AMD's x86 implementation) is by grepping the system log for 
the line "AMD Memory Encryption Features active".
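
As a rough sketch of the userspace side, assuming the attribute lands at
the path documented in this patch and keeps the "%d\n" format used by
crypto_capable_show(), a consumer could read it as below. The helper names
are made up for the example, and only `crypto_capable` is read, since
`crypto_active` is just a proposal at this point:

```c
#include <assert.h>
#include <stdio.h>

/* Build the per-node attribute path; returns snprintf()'s result. */
static int crypto_capable_path(char *buf, size_t len, int node)
{
	return snprintf(buf, len,
			"/sys/devices/system/node/node%d/crypto_capable", node);
}

/* Parse the attribute text: 1 or 0 on success, -1 if it is malformed. */
static int parse_crypto_capable(const char *text)
{
	int value;

	if (sscanf(text, "%d", &value) != 1)
		return -1;
	if (value != 0 && value != 1)
		return -1;
	return value;
}

/* Read and parse the attribute at @path; -1 if it cannot be read. */
static int read_crypto_capable(const char *path)
{
	char line[16];
	FILE *f = fopen(path, "r");
	int ret = -1;

	if (!f)
		return -1;
	if (fgets(line, sizeof(line), f))
		ret = parse_crypto_capable(line);
	fclose(f);
	return ret;
}
```

A caller would build the path for each node of interest and treat -1 as
"attribute not available", for example on a kernel without this series.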

> +}
> +static DEVICE_ATTR_RO(crypto_capable);
> +
>   static struct attribute *node_dev_attrs[] = {
>   	&dev_attr_meminfo.attr,
>   	&dev_attr_numastat.attr,
>   	&dev_attr_distance.attr,
>   	&dev_attr_vmstat.attr,
> +	&dev_attr_crypto_capable.attr,
>   	NULL
>   };
>   


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v6 6/6] drivers/node: Show in sysfs node's crypto capabilities
  2022-02-03 16:43 ` [PATCH v6 6/6] drivers/node: Show in sysfs node's crypto capabilities Martin Fernandez
  2022-02-04  3:47   ` Limonciello, Mario
@ 2022-02-04  4:56   ` Mike Rapoport
  2022-02-04 12:27     ` Martin Fernandez
  1 sibling, 1 reply; 37+ messages in thread
From: Mike Rapoport @ 2022-02-04  4:56 UTC (permalink / raw)
  To: Martin Fernandez
  Cc: linux-kernel, linux-efi, platform-driver-x86, linux-mm, tglx,
	mingo, bp, dave.hansen, x86, hpa, ardb, dvhart, andy, gregkh,
	rafael, akpm, daniel.gutson, hughsient, alex.bazhaniuk,
	alison.schofield, keescook

On Thu, Feb 03, 2022 at 01:43:28PM -0300, Martin Fernandez wrote:
> Show in sysfs, for each node, whether its memory can be encrypted by
> the CPU, i.e. whether all of its memory is marked with
> EFI_MEMORY_CPU_CRYPTO in the EFI memory map.
> 
> Signed-off-by: Martin Fernandez <martin.fernandez@eclypsium.com>
> ---
>  Documentation/ABI/testing/sysfs-devices-node | 10 ++++++++++
>  drivers/base/node.c                          | 10 ++++++++++
>  2 files changed, 20 insertions(+)
>  create mode 100644 Documentation/ABI/testing/sysfs-devices-node
> 
> diff --git a/Documentation/ABI/testing/sysfs-devices-node b/Documentation/ABI/testing/sysfs-devices-node
> new file mode 100644
> index 000000000000..0d1fd86c9faf
> --- /dev/null
> +++ b/Documentation/ABI/testing/sysfs-devices-node
> @@ -0,0 +1,10 @@
> +What:		/sys/devices/system/node/nodeX/crypto_capable
> +Date:		February 2022
> +Contact:	Martin Fernandez <martin.fernandez@eclypsium.com>
> +Users:		fwupd (https://fwupd.org)
> +Description:
> +		This value is 1 if all system memory in this node is
> +		marked with EFI_MEMORY_CPU_CRYPTO, indicating that the

It didn't jump out at me in previous postings, but other architectures won't
necessarily have EFI_MEMORY_CPU_CRYPTO marking crypto-capable memory.

How about

  This value is 1 if all system memory in this node is capable of being
  protected with the CPU's memory cryptographic capabilities. It is 0
  otherwise.
  On EFI architectures this value corresponds to EFI_MEMORY_CPU_CRYPTO.


> +		system memory is capable of being protected with the
> +		CPU’s memory cryptographic capabilities. It is 0
> +		otherwise.
> \ No newline at end of file
> diff --git a/drivers/base/node.c b/drivers/base/node.c
> index 87acc47e8951..dabaed997ecd 100644
> --- a/drivers/base/node.c
> +++ b/drivers/base/node.c
> @@ -560,11 +560,21 @@ static ssize_t node_read_distance(struct device *dev,
>  }
>  static DEVICE_ATTR(distance, 0444, node_read_distance, NULL);
>  
> +static ssize_t crypto_capable_show(struct device *dev,
> +				   struct device_attribute *attr, char *buf)
> +{
> +	struct pglist_data *pgdat = NODE_DATA(dev->id);
> +
> +	return sysfs_emit(buf, "%d\n", pgdat->crypto_capable);
> +}
> +static DEVICE_ATTR_RO(crypto_capable);
> +
>  static struct attribute *node_dev_attrs[] = {
>  	&dev_attr_meminfo.attr,
>  	&dev_attr_numastat.attr,
>  	&dev_attr_distance.attr,
>  	&dev_attr_vmstat.attr,
> +	&dev_attr_crypto_capable.attr,
>  	NULL
>  };
>  
> -- 
> 2.30.2
> 

-- 
Sincerely yours,
Mike.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v6 6/6] drivers/node: Show in sysfs node's crypto capabilities
  2022-02-04  4:56   ` Mike Rapoport
@ 2022-02-04 12:27     ` Martin Fernandez
  2022-02-04 13:37       ` Mike Rapoport
  0 siblings, 1 reply; 37+ messages in thread
From: Martin Fernandez @ 2022-02-04 12:27 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: linux-kernel, linux-efi, platform-driver-x86, linux-mm, tglx,
	mingo, bp, dave.hansen, x86, hpa, ardb, dvhart, andy, gregkh,
	rafael, akpm, daniel.gutson, hughsient, alex.bazhaniuk,
	alison.schofield, keescook

On 2/4/22, Mike Rapoport <rppt@kernel.org> wrote:
> On Thu, Feb 03, 2022 at 01:43:28PM -0300, Martin Fernandez wrote:
>> +Description:
>> +		This value is 1 if all system memory in this node is
>> +		marked with EFI_MEMORY_CPU_CRYPTO, indicating that the
>
> It didn't jump out at me in previous postings, but other architectures won't
> necessarily have EFI_MEMORY_CPU_CRYPTO marking crypto-capable memory.
>
> How about
>
>   This value is 1 if all system memory in this node is capable of being
>   protected with the CPU's memory cryptographic capabilities. It is 0
>   otherwise.
>   On EFI architectures this value corresponds to EFI_MEMORY_CPU_CRYPTO.
>
>

Yes, sounds good to me.

Is there another architecture with something similar to this? Or are you
thinking of the possibility of such an architecture?

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v6 6/6] drivers/node: Show in sysfs node's crypto capabilities
  2022-02-04  3:47   ` Limonciello, Mario
@ 2022-02-04 13:21     ` Martin Fernandez
  2022-02-04 15:59       ` Tom Lendacky
  0 siblings, 1 reply; 37+ messages in thread
From: Martin Fernandez @ 2022-02-04 13:21 UTC (permalink / raw)
  To: Limonciello, Mario
  Cc: linux-kernel, linux-efi, platform-driver-x86, linux-mm, tglx,
	mingo, bp, dave.hansen, x86, hpa, ardb, dvhart, andy, gregkh,
	rafael, rppt, akpm, daniel.gutson, hughsient, alex.bazhaniuk,
	alison.schofield, keescook, Lendacky, Thomas

On 2/4/22, Limonciello, Mario <mario.limonciello@amd.com> wrote:
> On 2/3/2022 10:43, Martin Fernandez wrote:
>> +static ssize_t crypto_capable_show(struct device *dev,
>> +				   struct device_attribute *attr, char *buf)
>> +{
>> +	struct pglist_data *pgdat = NODE_DATA(dev->id);
>> +
>> +	return sysfs_emit(buf, "%d\n", pgdat->crypto_capable);
>
> As there is interest in seeing these capabilities from userspace, it
> seems like a logical time to also expose a `crypto_active` attribute.

I planned to do something similar to this, but to show (or actually
hide if inactive) tme in cpuinfo, just as Borislav Petkov suggested a
few versions back.

https://lore.kernel.org/linux-efi/YXrnkxgdjWbcPlJA@zn.tnic/

> Then userspace can make a judgement call if the system supports crypto
> memory (`crypto_capable`) and then also whether or not it's been turned
> on (`crypto_active`).
>
> `crypto_active` could be detected with some existing support in the
> kernel of `mem_encrypt_active()`.  This will then work for a variety of
> architectures too that offer `mem_encrypt_active()`.

I need a hand with this, I grepped for mem_encrypt_active and nothing
showed up...

> As it stands today the only reliable way to tell from userspace (at
> least for AMD's x86 implementation) is by grepping the system log for
> the line "AMD Memory Encryption Features active".

Isn't it enough to grep for sme/sev in cpuinfo?
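
To make the question concrete, here is a small sketch of what "grepping
cpuinfo" amounts to when done carefully: matching a flag as a whole token,
so that "sev" does not match inside "sev_es". The function name is
hypothetical, and whether the presence of the flag is a sufficient signal
that encryption is active is exactly the open question here; the sketch
only covers the matching:

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Returns true if @flag appears in @line as a whole whitespace-separated
 * token, as in the "flags" line of /proc/cpuinfo.
 */
static bool cpuinfo_line_has_flag(const char *line, const char *flag)
{
	size_t len = strlen(flag);
	const char *p = line;

	if (len == 0)
		return false;

	while ((p = strstr(p, flag)) != NULL) {
		/* A token must start at the line start or after whitespace... */
		bool starts = (p == line) || p[-1] == ' ' || p[-1] == '\t';
		/* ...and must end at whitespace or the end of the string. */
		char end = p[len];
		bool ends = end == '\0' || end == ' ' ||
			    end == '\t' || end == '\n';

		if (starts && ends)
			return true;
		p++;	/* keep searching after this partial match */
	}
	return false;
}
```

A caller would read /proc/cpuinfo line by line, pick the lines that start
with "flags", and test them for "sme" or "sev".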

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v6 6/6] drivers/node: Show in sysfs node's crypto capabilities
  2022-02-04 12:27     ` Martin Fernandez
@ 2022-02-04 13:37       ` Mike Rapoport
  0 siblings, 0 replies; 37+ messages in thread
From: Mike Rapoport @ 2022-02-04 13:37 UTC (permalink / raw)
  To: Martin Fernandez
  Cc: linux-kernel, linux-efi, platform-driver-x86, linux-mm, tglx,
	mingo, bp, dave.hansen, x86, hpa, ardb, dvhart, andy, gregkh,
	rafael, akpm, daniel.gutson, hughsient, alex.bazhaniuk,
	alison.schofield, keescook

On Fri, Feb 04, 2022 at 09:27:42AM -0300, Martin Fernandez wrote:
> On 2/4/22, Mike Rapoport <rppt@kernel.org> wrote:
> > On Thu, Feb 03, 2022 at 01:43:28PM -0300, Martin Fernandez wrote:
> >> +Description:
> >> +		This value is 1 if all system memory in this node is
> >> +		marked with EFI_MEMORY_CPU_CRYPTO, indicating that the
> >
> > It didn't jump out at me in previous postings, but other architectures won't
> > necessarily have EFI_MEMORY_CPU_CRYPTO marking crypto-capable memory.
> >
> > How about
> >
> >   This value is 1 if all system memory in this node is capable of being
> >   protected with the CPU's memory cryptographic capabilities. It is 0
> >   otherwise.
> >   On EFI architectures this value corresponds to EFI_MEMORY_CPU_CRYPTO.
> >
> >
> 
> Yes, sounds good to me.
> 
> Is there another architecture with something similar to this? Or are you
> thinking of the possibility of such an architecture?

AFAIU, s390 and powerpc have memory encryption capabilities, but I don't
know the details.

-- 
Sincerely yours,
Mike.


* Re: [PATCH v6 6/6] drivers/node: Show in sysfs node's crypto capabilities
  2022-02-04 13:21     ` Martin Fernandez
@ 2022-02-04 15:59       ` Tom Lendacky
  2022-02-04 16:23         ` Limonciello, Mario
  0 siblings, 1 reply; 37+ messages in thread
From: Tom Lendacky @ 2022-02-04 15:59 UTC (permalink / raw)
  To: Martin Fernandez, Limonciello, Mario
  Cc: linux-kernel, linux-efi, platform-driver-x86, linux-mm, tglx,
	mingo, bp, dave.hansen, x86, hpa, ardb, dvhart, andy, gregkh,
	rafael, rppt, akpm, daniel.gutson, hughsient, alex.bazhaniuk,
	alison.schofield, keescook

On 2/4/22 07:21, Martin Fernandez wrote:
> On 2/4/22, Limonciello, Mario <mario.limonciello@amd.com> wrote:
>> On 2/3/2022 10:43, Martin Fernandez wrote:
>>> +static ssize_t crypto_capable_show(struct device *dev,
>>> +				   struct device_attribute *attr, char *buf)
>>> +{
>>> +	struct pglist_data *pgdat = NODE_DATA(dev->id);
>>> +
>>> +	return sysfs_emit(buf, "%d\n", pgdat->crypto_capable);
>>
>> As there is interest in seeing these capabilities from userspace, it
>> seems like a logical time to also expose a `crypto_active` attribute.
> 
> I planned to do something similar to this, but to show (or actually
> hide if inactive) tme in cpuinfo, just as Borislav Petkov suggested a
> few versions back.
> 
> https://lore.kernel.org/linux-efi/YXrnkxgdjWbcPlJA@zn.tnic/
> 
>> Then userspace can make a judgement call if the system supports crypto
>> memory (`crypto_capable`) and then also whether or not it's been turned
>> on (`crypto_active`).
>>
>> `crypto_active` could be detected with some existing support in the
>> kernel of `mem_encrypt_active()`.  This will then work for a variety of
>> architectures too that offer `mem_encrypt_active()`.
> 
> I need a hand with this, I grepped for mem_encrypt_active and nothing
> showed up...

The mem_encrypt_active() function has been replaced by 
cc_platform_has(CC_ATTR_MEM_ENCRYPT).

> 
>> As it stands today the only reliable way to tell from userspace (at
>> least for AMD's x86 implementation) is by grepping the system log for
>> the line "AMD Memory Encryption Features active".
> 
> Isn't it enough to grep for sme/sev in cpuinfo?

No, it's not enough. Cpuinfo shows a processor's capabilities and not
necessarily whether that capability is being used.

Thanks,
Tom


* Re: [PATCH v6 6/6] drivers/node: Show in sysfs node's crypto capabilities
  2022-02-04 15:59       ` Tom Lendacky
@ 2022-02-04 16:23         ` Limonciello, Mario
  2022-02-04 16:28           ` Borislav Petkov
  0 siblings, 1 reply; 37+ messages in thread
From: Limonciello, Mario @ 2022-02-04 16:23 UTC (permalink / raw)
  To: Tom Lendacky, Martin Fernandez
  Cc: linux-kernel, linux-efi, platform-driver-x86, linux-mm, tglx,
	mingo, bp, dave.hansen, x86, hpa, ardb, dvhart, andy, gregkh,
	rafael, rppt, akpm, daniel.gutson, hughsient, alex.bazhaniuk,
	alison.schofield, keescook

On 2/4/2022 09:59, Tom Lendacky wrote:
> On 2/4/22 07:21, Martin Fernandez wrote:
>> On 2/4/22, Limonciello, Mario <mario.limonciello@amd.com> wrote:
>>> On 2/3/2022 10:43, Martin Fernandez wrote:
>>>> +static ssize_t crypto_capable_show(struct device *dev,
>>>> +                   struct device_attribute *attr, char *buf)
>>>> +{
>>>> +    struct pglist_data *pgdat = NODE_DATA(dev->id);
>>>> +
>>>> +    return sysfs_emit(buf, "%d\n", pgdat->crypto_capable);
>>>
>>> As there is interest in seeing these capabilities from userspace, it
>>> seems like a logical time to also expose a `crypto_active` attribute.
>>
>> I planned to do something similar to this, but to show (or actually
>> hide if inactive) tme in cpuinfo, just as Borislav Petkov suggested a
>> few versions back.
>>
>> https://lore.kernel.org/linux-efi/YXrnkxgdjWbcPlJA@zn.tnic/

As Tom agreed in the previous post, Boris is mistaken here.  I just
double-checked on my side, on a workstation that supports SME, by
comparing /proc/cpuinfo before and after SME is enabled via
mem_encrypt=on.  I confirmed that nothing changed.

>>
>>> Then userspace can make a judgement call if the system supports crypto
>>> memory (`crypto_capable`) and then also whether or not it's been turned
>>> on (`crypto_active`).
>>>
>>> `crypto_active` could be detected with some existing support in the
>>> kernel of `mem_encrypt_active()`.  This will then work for a variety of
>>> architectures too that offer `mem_encrypt_active()`.
>>
>> I need a hand with this, I grepped for mem_encrypt_active and nothing
>> showed up...
> 
> The mem_encrypt_active() function has been replaced by 
> cc_platform_has(CC_ATTR_MEM_ENCRYPT).

Yes, thanks for correcting it.

> 
>>
>>> As it stands today the only reliable way to tell from userspace (at
>>> least for AMD's x86 implementation) is by grepping the system log for
>>> the line "AMD Memory Encryption Features active".
>>
>> Isn't it enough to grep for sme/sev in cpuinfo?
> 
> No, it's not enough. Cpuinfo shows a processor's capabilities and not
> necessarily whether that capability is being used.
> 
> Thanks,
> Tom

Tom,

Maybe some sysfs file(s) directly from cc_platform.c would make more sense then?


* Re: [PATCH v6 6/6] drivers/node: Show in sysfs node's crypto capabilities
  2022-02-04 16:23         ` Limonciello, Mario
@ 2022-02-04 16:28           ` Borislav Petkov
  2022-02-04 17:12             ` Tom Lendacky
  2022-02-07  3:39             ` Kees Cook
  0 siblings, 2 replies; 37+ messages in thread
From: Borislav Petkov @ 2022-02-04 16:28 UTC (permalink / raw)
  To: Limonciello, Mario
  Cc: Tom Lendacky, Martin Fernandez, linux-kernel, linux-efi,
	platform-driver-x86, linux-mm, tglx, mingo, dave.hansen, x86,
	hpa, ardb, dvhart, andy, gregkh, rafael, rppt, akpm,
	daniel.gutson, hughsient, alex.bazhaniuk, alison.schofield,
	keescook

On Fri, Feb 04, 2022 at 10:23:22AM -0600, Limonciello, Mario wrote:
> > > > As there is interest in seeing these capabilities from userspace, it

This needs to be explained in a lot more detail: why, what is going to
use it, how, etc.

We don't do user-visible APIs just because.

> As Tom agreed in the previous post, Boris is mistaken here.  I just
> double-checked on my side, on a workstation that supports SME, by
> comparing /proc/cpuinfo before and after SME is enabled via
> mem_encrypt=on.  I confirmed that nothing changed.

Then we should clear that "sme" flag if memory encryption is not
enabled. Like we do for all other flags.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette


* Re: [PATCH v6 6/6] drivers/node: Show in sysfs node's crypto capabilities
  2022-02-04 16:28           ` Borislav Petkov
@ 2022-02-04 17:12             ` Tom Lendacky
  2022-02-04 17:49               ` Limonciello, Mario
  2022-02-04 18:00               ` Borislav Petkov
  2022-02-07  3:39             ` Kees Cook
  1 sibling, 2 replies; 37+ messages in thread
From: Tom Lendacky @ 2022-02-04 17:12 UTC (permalink / raw)
  To: Borislav Petkov, Limonciello, Mario
  Cc: Martin Fernandez, linux-kernel, linux-efi, platform-driver-x86,
	linux-mm, tglx, mingo, dave.hansen, x86, hpa, ardb, dvhart, andy,
	gregkh, rafael, rppt, akpm, daniel.gutson, hughsient,
	alex.bazhaniuk, alison.schofield, keescook

On 2/4/22 10:28, Borislav Petkov wrote:
> On Fri, Feb 04, 2022 at 10:23:22AM -0600, Limonciello, Mario wrote:
>>>>> As there is interest in seeing these capabilities from userspace, it
> 
> This needs to be explained in a lot more detail: why, what is going to
> use it, how, etc.
> 
> We don't do user-visible APIs just because.
> 
>> As Tom agreed in the previous post, Boris is mistaken here.  I just
>> double-checked on my side, on a workstation that supports SME, by
>> comparing /proc/cpuinfo before and after SME is enabled via
>> mem_encrypt=on.  I confirmed that nothing changed.
> 
> Then we should clear that "sme" flag if memory encryption is not
> enabled. Like we do for all other flags.

If we do that, then this will have to be re-worked:

https://elixir.bootlin.com/linux/latest/source/arch/x86/kernel/process.c#L761

Thanks,
Tom

> 


* Re: [PATCH v6 6/6] drivers/node: Show in sysfs node's crypto capabilities
  2022-02-04 17:12             ` Tom Lendacky
@ 2022-02-04 17:49               ` Limonciello, Mario
  2022-02-04 18:00               ` Borislav Petkov
  1 sibling, 0 replies; 37+ messages in thread
From: Limonciello, Mario @ 2022-02-04 17:49 UTC (permalink / raw)
  To: Tom Lendacky, Borislav Petkov
  Cc: Martin Fernandez, linux-kernel, linux-efi, platform-driver-x86,
	linux-mm, tglx, mingo, dave.hansen, x86, hpa, ardb, dvhart, andy,
	gregkh, rafael, rppt, akpm, daniel.gutson, hughsient,
	alex.bazhaniuk, alison.schofield, keescook

On 2/4/2022 11:12, Tom Lendacky wrote:
> On 2/4/22 10:28, Borislav Petkov wrote:
>> On Fri, Feb 04, 2022 at 10:23:22AM -0600, Limonciello, Mario wrote:
>>>>>> As there is interest in seeing these capabilities from userspace, it
>>
>> This needs to be explained in a lot more detail: why, what is going to
>> use it, how, etc.
>>
>> We don't do user-visible APIs just because.

The fwupd daemon has a feature that measures various security aspects of 
the system hardware, software and firmware and reflects it out to 
consumers (fwupd clients) in an easily consumable format, in some cases 
with actionable notes.

In this case the information would be used to make a check about memory 
encryption support and enablement.  If a sysfs file was made then it 
could be something like this:

1) fwupd checks /sys/security/memory_encryption
    1: You're encrypted, here's a gold star.
    0: Keep checking.

2) fwupd checks whether /proc/cpuinfo has sme, sev_es, or mktme.
    No: Your hardware doesn't support encryption, tell the user.
    Yes: Keep going.

3) AMD?
    Check /proc/cmdline.
    Did the user explicitly set mem_encrypt=off?
        That's why. Tell the user they can enable it with
        mem_encrypt=on or CONFIG_AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT.
    mem_encrypt=on/CONFIG_AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT already set?
        We've got a kernel or hardware problem.

4) Intel?
    Document Intel's path to turn it on.
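
The steps above could be sketched in userspace C roughly as follows.
This is only a sketch under assumptions: check_encryption(), has_token()
and the enum names are made up for illustration, the inputs are passed in
as strings instead of being read from the files, and treating "no
mem_encrypt= option and no config default" the same as mem_encrypt=off is
an assumption the list above does not spell out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical verdicts, one per outcome of the steps listed above. */
enum enc_verdict {
	ENC_ACTIVE,			/* step 1: sysfs file reads 1 */
	ENC_UNSUPPORTED,		/* step 2: no matching flag in cpuinfo */
	ENC_NOT_ENABLED,		/* step 3: off, or never requested (assumption) */
	ENC_KERNEL_OR_HW_PROBLEM,	/* step 3: requested on, still inactive */
	ENC_SEE_VENDOR_DOCS,		/* step 4: not AMD, point at the docs */
};

/* Return true if @word is a whole whitespace-separated token of @list. */
static bool has_token(const char *list, const char *word)
{
	size_t len = strlen(word);
	const char *p = list;

	assert(len > 0);
	while ((p = strstr(p, word)) != NULL) {
		bool starts = p == list || p[-1] == ' ' || p[-1] == '\t';
		bool ends = p[len] == '\0' || p[len] == ' ' ||
			    p[len] == '\t' || p[len] == '\n';

		if (starts && ends)
			return true;
		p++;
	}
	return false;
}

/*
 * @sysfs_value: content of the proposed sysfs file (1 = encrypted)
 * @cpu_flags:   the "flags" line of /proc/cpuinfo
 * @cmdline:     content of /proc/cmdline
 * @is_amd:      true on AMD, false is treated as the Intel path
 * @default_on:  CONFIG_AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT is set
 */
static enum enc_verdict check_encryption(int sysfs_value, const char *cpu_flags,
					 const char *cmdline, bool is_amd,
					 bool default_on)
{
	if (sysfs_value == 1)
		return ENC_ACTIVE;

	if (!has_token(cpu_flags, "sme") && !has_token(cpu_flags, "sev_es") &&
	    !has_token(cpu_flags, "mktme"))
		return ENC_UNSUPPORTED;

	if (!is_amd)
		return ENC_SEE_VENDOR_DOCS;

	if (has_token(cmdline, "mem_encrypt=off"))
		return ENC_NOT_ENABLED;

	if (has_token(cmdline, "mem_encrypt=on") || default_on)
		return ENC_KERNEL_OR_HW_PROBLEM;

	return ENC_NOT_ENABLED;
}
```

A caller would read the three files, pass their contents in, and map the
returned verdict to the message shown to the user.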

>>
>>> As Tom agreed in the previous post, Boris is mistaken here.  I just
>>> double-checked on my side, on a workstation that supports SME, by
>>> comparing /proc/cpuinfo before and after SME is enabled via
>>> mem_encrypt=on.  I confirmed that nothing changed.
>>
>> Then we should clear that "sme" flag if memory encryption is not
>> enabled. Like we do for all other flags.
> 
> If we do that, then this will have to be re-worked:
> 
> https://elixir.bootlin.com/linux/latest/source/arch/x86/kernel/process.c#L761 
> 

I guess if sme/sev/sev_es are torn out of cpuinfo when encryption is
turned off, then that code could do the MSR read instead, perhaps?

> 
> Thanks,
> Tom
> 
>>



* Re: [PATCH v6 6/6] drivers/node: Show in sysfs node's crypto capabilities
  2022-02-04 17:12             ` Tom Lendacky
  2022-02-04 17:49               ` Limonciello, Mario
@ 2022-02-04 18:00               ` Borislav Petkov
  2022-02-04 18:49                 ` Tom Lendacky
  1 sibling, 1 reply; 37+ messages in thread
From: Borislav Petkov @ 2022-02-04 18:00 UTC (permalink / raw)
  To: Tom Lendacky
  Cc: Limonciello, Mario, Martin Fernandez, linux-kernel, linux-efi,
	platform-driver-x86, linux-mm, tglx, mingo, dave.hansen, x86,
	hpa, ardb, dvhart, andy, gregkh, rafael, rppt, akpm,
	daniel.gutson, hughsient, alex.bazhaniuk, alison.schofield,
	keescook

On Fri, Feb 04, 2022 at 11:12:04AM -0600, Tom Lendacky wrote:
> https://elixir.bootlin.com/linux/latest/source/arch/x86/kernel/process.c#L761

For those who won't open a browser just to see what he means :), that's
this snippet:

void stop_this_cpu(void *dummy):
	/*
	 * Use wbinvd on processors that support SME. This provides support
	 * for performing a successful kexec when going from SME inactive
	 * to SME active (or vice-versa). The cache must be cleared so that
	 * if there are entries with the same physical address, both with and
	 * without the encryption bit, they don't race each other when flushed
	 * and potentially end up with the wrong entry being committed to
	 * memory.
	 */
	if (boot_cpu_has(X86_FEATURE_SME))
		native_wbinvd();


Well, we do clear our *representation* of CPUID flags for other features
- see output of

$ git grep -E "(setup_)?clear_cpu_cap"

for examples. We do that for SME even: early_detect_mem_encrypt().

Which means, since this needs to be "processors that support SME", this
line should change to:

	/* ... test the CPUID bit directly because the machine might've cleared
	 * X86_FEATURE_SME due to cmdline options.
	 */
	if (cpuid_eax(0x8000001f) & BIT(0))
		native_wbinvd();

I'd say...
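
The same bit can be looked at from userspace, which may help when
comparing against what /proc/cpuinfo reports. A rough sketch, not kernel
code: sme_bit_set() and cpu_reports_sme() are names made up here, and
__get_cpuid() is the helper from the compiler's <cpuid.h> (GCC and
clang, x86 only), so the non-x86 branch simply returns false.

```c
#include <stdbool.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

/* CPUID leaf 0x8000001f, EAX bit 0 is the SME bit tested in the snippet above. */
static bool sme_bit_set(unsigned int eax)
{
	return (eax & 1u) != 0;
}

/* Ask the CPU directly; false on non-x86 builds or if the leaf is absent. */
static bool cpu_reports_sme(void)
{
#if defined(__x86_64__) || defined(__i386__)
	unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;

	/* __get_cpuid() returns 0 when the requested leaf is not supported. */
	if (!__get_cpuid(0x8000001f, &eax, &ebx, &ecx, &edx))
		return false;
	return sme_bit_set(eax);
#else
	return false;
#endif
}
```

Unlike the cpuinfo flag, this result does not depend on whether the
kernel cleared its own X86_FEATURE_SME representation.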

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette


* Re: [PATCH v6 6/6] drivers/node: Show in sysfs node's crypto capabilities
  2022-02-04 18:00               ` Borislav Petkov
@ 2022-02-04 18:49                 ` Tom Lendacky
  2022-02-04 21:49                   ` Borislav Petkov
  0 siblings, 1 reply; 37+ messages in thread
From: Tom Lendacky @ 2022-02-04 18:49 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Limonciello, Mario, Martin Fernandez, linux-kernel, linux-efi,
	platform-driver-x86, linux-mm, tglx, mingo, dave.hansen, x86,
	hpa, ardb, dvhart, andy, gregkh, rafael, rppt, akpm,
	daniel.gutson, hughsient, alex.bazhaniuk, alison.schofield,
	keescook

On 2/4/22 12:00, Borislav Petkov wrote:
> On Fri, Feb 04, 2022 at 11:12:04AM -0600, Tom Lendacky wrote:
>> https://elixir.bootlin.com/linux/latest/source/arch/x86/kernel/process.c#L761
> 
> For those who won't open a browser just to see what he means :), that's
> this snippet:
> 
> void stop_this_cpu(void *dummy):
> 	/*
> 	 * Use wbinvd on processors that support SME. This provides support
> 	 * for performing a successful kexec when going from SME inactive
> 	 * to SME active (or vice-versa). The cache must be cleared so that
> 	 * if there are entries with the same physical address, both with and
> 	 * without the encryption bit, they don't race each other when flushed
> 	 * and potentially end up with the wrong entry being committed to
> 	 * memory.
> 	 */
> 	if (boot_cpu_has(X86_FEATURE_SME))
> 		native_wbinvd();
> 
> 
> Well, we do clear our *representation* of CPUID flags for other features
> - see output of
> 
> $ git grep -E "(setup_)?clear_cpu_cap"
> 
> for examples. We do that for SME even: early_detect_mem_encrypt().
> 
> Which means, since this needs to be "processors that support SME", this
> line should change to:
> 
> 	/* ... test the CPUID bit directly because the machine might've cleared
> 	 * X86_FEATURE_SME due to cmdline options.
> 	 */
> 	if (cpuid_eax(0x8000001f) & BIT(0))
> 		native_wbinvd();
> 
> I'd say...

Yep, and that should be safe. We would have to look at the generated code 
as there can't be any memory stores after the native_wbinvd() and before 
the native_halt().

Thanks,
Tom

> 


* Re: [PATCH v6 6/6] drivers/node: Show in sysfs node's crypto capabilities
  2022-02-04 18:49                 ` Tom Lendacky
@ 2022-02-04 21:49                   ` Borislav Petkov
  0 siblings, 0 replies; 37+ messages in thread
From: Borislav Petkov @ 2022-02-04 21:49 UTC (permalink / raw)
  To: Tom Lendacky
  Cc: Limonciello, Mario, Martin Fernandez, linux-kernel, linux-efi,
	platform-driver-x86, linux-mm, tglx, mingo, dave.hansen, x86,
	hpa, ardb, dvhart, andy, gregkh, rafael, rppt, akpm,
	daniel.gutson, hughsient, alex.bazhaniuk, alison.schofield,
	keescook

On Fri, Feb 04, 2022 at 12:49:08PM -0600, Tom Lendacky wrote:
> Yep, and that should be safe. We would have to look at the generated code as
> there can't be any memory stores after the native_wbinvd() and before the
> native_halt().

I don't think anything else changes here besides the CPUID. Rest of the
asm is the same.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette


* Re: [PATCH v6 6/6] drivers/node: Show in sysfs node's crypto capabilities
  2022-02-04 16:28           ` Borislav Petkov
  2022-02-04 17:12             ` Tom Lendacky
@ 2022-02-07  3:39             ` Kees Cook
  2022-02-07 10:02               ` Borislav Petkov
  1 sibling, 1 reply; 37+ messages in thread
From: Kees Cook @ 2022-02-07  3:39 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Limonciello, Mario, Tom Lendacky, Martin Fernandez, linux-kernel,
	linux-efi, platform-driver-x86, linux-mm, tglx, mingo,
	dave.hansen, x86, hpa, ardb, dvhart, andy, gregkh, rafael, rppt,
	akpm, daniel.gutson, hughsient, alex.bazhaniuk, alison.schofield

On Fri, Feb 04, 2022 at 05:28:43PM +0100, Borislav Petkov wrote:
> Then we should clear that "sme" flag if memory encryption is not
> enabled. Like we do for all other flags.

Oh, this seems weird to me, as I'd expect it to show up since the CPU is
_capable_ of it, even if it's not in use. (Am I really using avx512vl,
e.g.?)

But as you point out later, it does work that way for a lot of things
and boot params. If this is the way things are supposed to be done,
it looks like we should wire up "nx" vs "noexec=off" boot param to do
the same (separate from this series), though it would need special
care since that bit needs very, very early handling both at boot
and resume. Maybe kernel/cpu/common.c should check for _PAGE_NX in
__supported_pte_mask? (And would that break KVM's NX, etc?)

Hmmm.

-- 
Kees Cook


* Re: [PATCH v6 6/6] drivers/node: Show in sysfs node's crypto capabilities
  2022-02-07  3:39             ` Kees Cook
@ 2022-02-07 10:02               ` Borislav Petkov
  0 siblings, 0 replies; 37+ messages in thread
From: Borislav Petkov @ 2022-02-07 10:02 UTC (permalink / raw)
  To: Kees Cook
  Cc: Limonciello, Mario, Tom Lendacky, Martin Fernandez, linux-kernel,
	linux-efi, platform-driver-x86, linux-mm, tglx, mingo,
	dave.hansen, x86, hpa, ardb, dvhart, andy, gregkh, rafael, rppt,
	akpm, daniel.gutson, hughsient, alex.bazhaniuk, alison.schofield

On Sun, Feb 06, 2022 at 07:39:46PM -0800, Kees Cook wrote:
> Oh, this seems weird to me, as I'd expect it to show up since the CPU is
> _capable_ of it, even if it's not in use. (Am I really using avx512vl,
> e.g.?)

We're trying to put feature flags in /proc/cpuinfo which mean that the
kernel supports the feature - not every CPUID bit out there. For that
there's tools/arch/x86/kcpuid/kcpuid.c

Otherwise /proc/cpuinfo becomes a dumping ground for feature flags and
there's no shortage of those.

> But as you point out later, it does work that way for a lot of things
> and boot params. If this is the way things are supposed to be done,
> it looks like we should wire up "nx" vs "noexec=off" boot param to do

See here:

https://lore.kernel.org/r/20220127115626.14179-1-bp@alien8.de

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette


* Re: [PATCH v6 1/6] mm/memblock: Tag memblocks with crypto capabilities
  2022-02-03 16:43 ` [PATCH v6 1/6] mm/memblock: Tag memblocks with crypto capabilities Martin Fernandez
  2022-02-03 18:07   ` Mike Rapoport
@ 2022-02-07 21:18   ` Kees Cook
  2022-02-08 14:39     ` Martin Fernandez
  1 sibling, 1 reply; 37+ messages in thread
From: Kees Cook @ 2022-02-07 21:18 UTC (permalink / raw)
  To: Martin Fernandez
  Cc: linux-kernel, linux-efi, platform-driver-x86, linux-mm, tglx,
	mingo, bp, dave.hansen, x86, hpa, ardb, dvhart, andy, gregkh,
	rafael, rppt, akpm, daniel.gutson, hughsient, alex.bazhaniuk,
	alison.schofield

On Thu, Feb 03, 2022 at 01:43:23PM -0300, Martin Fernandez wrote:
> Add the capability to mark regions of the memory memory_type as capable
> of hardware memory encryption.
> 
> Also add the capability to query if all regions of a memory node are
> able to do hardware memory encryption to call it when initializing the
> nodes. Warn the user if a node has both encryptable and
> non-encryptable regions.
> 
> Signed-off-by: Martin Fernandez <martin.fernandez@eclypsium.com>
> ---
>  include/linux/memblock.h | 15 ++++++----
>  mm/memblock.c            | 64 ++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 74 insertions(+), 5 deletions(-)
> 
> diff --git a/include/linux/memblock.h b/include/linux/memblock.h
> index 9dc7cb239d21..73edcce165a5 100644
> --- a/include/linux/memblock.h
> +++ b/include/linux/memblock.h
> @@ -41,13 +41,15 @@ extern unsigned long long max_possible_pfn;
>   * via a driver, and never indicated in the firmware-provided memory map as
>   * system RAM. This corresponds to IORESOURCE_SYSRAM_DRIVER_MANAGED in the
>   * kernel resource tree.
> + * @MEMBLOCK_CRYPTO_CAPABLE: capable of hardware encryption
>   */
>  enum memblock_flags {
> -	MEMBLOCK_NONE		= 0x0,	/* No special request */
> -	MEMBLOCK_HOTPLUG	= 0x1,	/* hotpluggable region */
> -	MEMBLOCK_MIRROR		= 0x2,	/* mirrored region */
> -	MEMBLOCK_NOMAP		= 0x4,	/* don't add to kernel direct mapping */
> -	MEMBLOCK_DRIVER_MANAGED = 0x8,	/* always detected via a driver */
> +	MEMBLOCK_NONE		= 0x0,		/* No special request */
> +	MEMBLOCK_HOTPLUG	= 0x1,		/* hotpluggable region */
> +	MEMBLOCK_MIRROR		= 0x2,		/* mirrored region */
> +	MEMBLOCK_NOMAP		= 0x4,		/* don't add to kernel direct mapping */
> +	MEMBLOCK_DRIVER_MANAGED = 0x8,		/* always detected via a driver */
> +	MEMBLOCK_CRYPTO_CAPABLE = 0x10,		/* capable of hardware encryption */

As already suggested, please keep the tabs like they were. If you're
going to change every line, maybe expand the single-digit literals to 2
digits. (i.e. 0x0 -> 0x00, to keep the most significant bits lined up.)

>  };
>  
>  /**
> @@ -121,6 +123,9 @@ int memblock_physmem_add(phys_addr_t base, phys_addr_t size);
>  void memblock_trim_memory(phys_addr_t align);
>  bool memblock_overlaps_region(struct memblock_type *type,
>  			      phys_addr_t base, phys_addr_t size);
> +bool memblock_node_is_crypto_capable(int nid);
> +int memblock_mark_crypto_capable(phys_addr_t base, phys_addr_t size);
> +int memblock_clear_crypto_capable(phys_addr_t base, phys_addr_t size);
>  int memblock_mark_hotplug(phys_addr_t base, phys_addr_t size);
>  int memblock_clear_hotplug(phys_addr_t base, phys_addr_t size);
>  int memblock_mark_mirror(phys_addr_t base, phys_addr_t size);
> diff --git a/mm/memblock.c b/mm/memblock.c
> index 1018e50566f3..fcf79befeab3 100644
> --- a/mm/memblock.c
> +++ b/mm/memblock.c
> @@ -191,6 +191,42 @@ bool __init_memblock memblock_overlaps_region(struct memblock_type *type,
>  	return i < type->cnt;
>  }
>  
> +/**
> + * memblock_node_is_crypto_capable - get if whole node is capable
> + * of encryption
> + * @nid: number of node
> + *
> + * Iterate over all memory memblock_type and find if all regions under
> + * node @nid are capable of hardware encryption.
> + *
> + * Return:
> + * true if every region in memory memblock_type is capable of
> + * encryption, false otherwise.
> + */
> +bool __init_memblock memblock_node_is_crypto_capable(int nid)
> +{
> +	struct memblock_region *region;
> +	bool crypto_capable = false;
> +	bool not_crypto_capable = false;
> +
> +	for_each_mem_region(region) {
> +		if (memblock_get_region_node(region) == nid) {
> +			crypto_capable =
> +				crypto_capable ||
> +				(region->flags & MEMBLOCK_CRYPTO_CAPABLE);

This was already mentioned, but I just thought I'd add: this made me
double-take, given the "||" (instead of "|") in an assignment. It looked
like a typo, but yes it's correct. I was expecting something like:

			crypto_capable |=
				!!(region->flags & MEMBLOCK_CRYPTO_CAPABLE);

> +			not_crypto_capable =
> +				not_crypto_capable ||
> +				!(region->flags & MEMBLOCK_CRYPTO_CAPABLE);

			not_crypto_capable |=
				!(region->flags & MEMBLOCK_CRYPTO_CAPABLE);

> +		}
> +	}
> +
> +	if (crypto_capable && not_crypto_capable)
> +		pr_warn_once("Node %d has regions that are encryptable and regions that aren't",
> +			     nid);
> +
> +	return !not_crypto_capable;
> +}
> +
>  /**
>   * __memblock_find_range_bottom_up - find free area utility in bottom-up
>   * @start: start of candidate range
> @@ -885,6 +921,34 @@ static int __init_memblock memblock_setclr_flag(phys_addr_t base,
>  	return 0;
>  }
>  
> +/**
> + * memblock_mark_crypto_capable - Mark memory regions capable of hardware
> + * encryption with flag MEMBLOCK_CRYPTO_CAPABLE.
> + * @base: the base phys addr of the region
> + * @size: the size of the region
> + *
> + * Return: 0 on success, -errno on failure.
> + */
> +int __init_memblock memblock_mark_crypto_capable(phys_addr_t base,
> +						 phys_addr_t size)
> +{
> +	return memblock_setclr_flag(base, size, 1, MEMBLOCK_CRYPTO_CAPABLE);
> +}
> +
> +/**
> + * memblock_clear_crypto_capable - Clear flag MEMBLOCK_CRYPTO_CAPABLE
> + * for a specified region.
> + * @base: the base phys addr of the region
> + * @size: the size of the region
> + *
> + * Return: 0 on success, -errno on failure.
> + */
> +int __init_memblock memblock_clear_crypto_capable(phys_addr_t base,
> +						  phys_addr_t size)
> +{
> +	return memblock_setclr_flag(base, size, 0, MEMBLOCK_CRYPTO_CAPABLE);
> +}
> +
>  /**
>   * memblock_mark_hotplug - Mark hotpluggable memory with flag MEMBLOCK_HOTPLUG.
>   * @base: the base phys addr of the region
> -- 
> 2.30.2
> 
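
For illustration, a small userspace model of the loop with the "|=" form
could look like the sketch below. It is not the memblock code: struct
mock_region, MOCK_CRYPTO_CAPABLE and node_is_crypto_capable() are
stand-ins invented here, and the region array replaces
for_each_mem_region().

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Same value as MEMBLOCK_CRYPTO_CAPABLE in the quoted patch. */
#define MOCK_CRYPTO_CAPABLE 0x10

/* Hypothetical stand-in for struct memblock_region. */
struct mock_region {
	int nid;
	unsigned int flags;
};

/* Return true if no region on node @nid lacks the crypto-capable flag. */
static bool node_is_crypto_capable(const struct mock_region *regions,
				   size_t cnt, int nid)
{
	bool crypto_capable = false;
	bool not_crypto_capable = false;
	size_t i;

	for (i = 0; i < cnt; i++) {
		if (regions[i].nid != nid)
			continue;
		/* !! folds the 0x10 flag bit down to 0 or 1 before OR-ing. */
		crypto_capable |= !!(regions[i].flags & MOCK_CRYPTO_CAPABLE);
		not_crypto_capable |= !(regions[i].flags & MOCK_CRYPTO_CAPABLE);
	}

	if (crypto_capable && not_crypto_capable)
		fprintf(stderr,
			"Node %d has regions that are encryptable and regions that aren't\n",
			nid);

	return !not_crypto_capable;
}
```

As in the quoted function, a node with no regions at all comes back as
true, because nothing ever sets not_crypto_capable.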

-- 
Kees Cook


* Re: [PATCH v6 2/6] mm/mmzone: Tag pg_data_t with crypto capabilities
  2022-02-03 16:43 ` [PATCH v6 2/6] mm/mmzone: Tag pg_data_t " Martin Fernandez
@ 2022-02-07 21:19   ` Kees Cook
  0 siblings, 0 replies; 37+ messages in thread
From: Kees Cook @ 2022-02-07 21:19 UTC (permalink / raw)
  To: Martin Fernandez
  Cc: linux-kernel, linux-efi, platform-driver-x86, linux-mm, tglx,
	mingo, bp, dave.hansen, x86, hpa, ardb, dvhart, andy, gregkh,
	rafael, rppt, akpm, daniel.gutson, hughsient, alex.bazhaniuk,
	alison.schofield

On Thu, Feb 03, 2022 at 01:43:24PM -0300, Martin Fernandez wrote:
> Add a new member in the pg_data_t struct to tell whether the node
> corresponding to that pg_data_t is able to do hardware memory encryption.
> 
> This will be read from sysfs.
> 
> Signed-off-by: Martin Fernandez <martin.fernandez@eclypsium.com>

Seems reasonable, and doesn't grow the structure size. :)

Reviewed-by: Kees Cook <keescook@chromium.org>

-- 
Kees Cook


* Re: [PATCH v6 3/6] x86/e820: Refactor range_update and range_remove
  2022-02-03 16:43 ` [PATCH v6 3/6] x86/e820: Refactor range_update and range_remove Martin Fernandez
@ 2022-02-07 21:45   ` Kees Cook
  2022-02-08  8:40     ` Mike Rapoport
                       ` (2 more replies)
  2022-02-08 21:04   ` Daniel Gutson
  1 sibling, 3 replies; 37+ messages in thread
From: Kees Cook @ 2022-02-07 21:45 UTC (permalink / raw)
  To: Martin Fernandez
  Cc: linux-kernel, linux-efi, platform-driver-x86, linux-mm, tglx,
	mingo, bp, dave.hansen, x86, hpa, ardb, dvhart, andy, gregkh,
	rafael, rppt, akpm, daniel.gutson, hughsient, alex.bazhaniuk,
	alison.schofield

On Thu, Feb 03, 2022 at 01:43:25PM -0300, Martin Fernandez wrote:
> __e820__range_update and e820__range_remove had very similar
> implementations that differed in only a few lines: the lines that
> actually perform the modification of the e820_table. The
> similarities were found in the checks for the different cases of how
> each entry intersects with the given range (if it does at all). These
> checks were very precise and error-prone, so it was not a good idea to
> have them in both places.

Yay removing copy/paste code! :)

> 
> I propose a refactor of those functions, given that I need to create a
> similar one for this patchset.

The diff here is pretty hard (for me) to review; I'll need more time
to check it. What might make review easier (at least for me), is to
incrementally change these routines. i.e. separate patches to:

- add the new infrastructure
- replace e820__range_remove
- replace __e820__range_update

If that's not actually useful, no worries. I'll just stare at it a bit
more. :)

> 
> Add a function to modify an E820 table in a given range. This
> modification is backed by two helper structs:
> e820_entry_updater and e820_*_data.
> 
> The first one, e820_entry_updater, carries 3 callbacks which function
> as the actions to take on the table.
> 
> The other one, e820_*_data carries information needed by the
> callbacks, for example in the case of range_update it will carry the
> type that we are targeting.

Something I think would be really amazing here is if you could add KUnit
tests here to exercise the corner cases and validate the changes. It
should be pretty easy to add. Here's a quick example for the boilerplate
and testing a bit of __e820__range_add():

#ifdef CONFIG_E820_KUNIT_TEST
#include <kunit/test.h>

static void __init test_e820_range_add(struct kunit *context)
{
	struct e820_table table;
	u32 full;

	full = ARRAY_SIZE(table.entries);
	/* Add last entry. */
	table.nr_entries = full - 1;
	__e820__range_add(&table, 0, 15, 0);
	KUNIT_EXPECT_EQ(context, table.nr_entries, full);
	/* Skip new entry when full. */
	__e820__range_add(&table, 0, 15, 0);
	KUNIT_EXPECT_EQ(context, table.nr_entries, full);
}

static void __init test_e820_update(struct kunit *context)
{
...
}

static struct kunit_case __refdata e820_test_cases[] = {
        KUNIT_CASE(test_e820_range_add),
        KUNIT_CASE(test_e820_update),
	...
        {}
};

static struct kunit_suite e820_test_suite = {
        .name = "e820",
        .test_cases = e820_test_cases,
};

kunit_test_suites(&e820_test_suite);
#endif

-- 
Kees Cook


* Re: [PATCH v6 4/6] x86/e820: Tag e820_entry with crypto capabilities
  2022-02-03 16:43 ` [PATCH v6 4/6] x86/e820: Tag e820_entry with crypto capabilities Martin Fernandez
@ 2022-02-07 21:56   ` Kees Cook
  2022-02-08 14:46     ` Martin Fernandez
  0 siblings, 1 reply; 37+ messages in thread
From: Kees Cook @ 2022-02-07 21:56 UTC (permalink / raw)
  To: Martin Fernandez
  Cc: linux-kernel, linux-efi, platform-driver-x86, linux-mm, tglx,
	mingo, bp, dave.hansen, x86, hpa, ardb, dvhart, andy, gregkh,
	rafael, rppt, akpm, daniel.gutson, hughsient, alex.bazhaniuk,
	alison.schofield

On Thu, Feb 03, 2022 at 01:43:26PM -0300, Martin Fernandez wrote:
> Add a new enum for crypto capabilities.
> 
> Add a new member in e820_entry to hold whether an entry is able to do
> hardware memory encryption or not.
> 
> Add a new function e820__range_set_crypto_capable to mark all the
> entries in a range of addresses as encryptable. This will be called
> when initializing EFI.
> 
> Change e820__update_table to handle merging and overlap problems
> taking into account crypto_capable.
> 
> Signed-off-by: Martin Fernandez <martin.fernandez@eclypsium.com>
> ---
>  arch/x86/include/asm/e820/api.h   |   1 +
>  arch/x86/include/asm/e820/types.h |  12 +++-
>  arch/x86/kernel/e820.c            | 114 ++++++++++++++++++++++++++++--
>  3 files changed, 119 insertions(+), 8 deletions(-)
> 
> diff --git a/arch/x86/include/asm/e820/api.h b/arch/x86/include/asm/e820/api.h
> index e8f58ddd06d9..4b3b01fafdd1 100644
> --- a/arch/x86/include/asm/e820/api.h
> +++ b/arch/x86/include/asm/e820/api.h
> @@ -17,6 +17,7 @@ extern bool e820__mapped_all(u64 start, u64 end, enum e820_type type);
>  extern void e820__range_add   (u64 start, u64 size, enum e820_type type);
>  extern u64  e820__range_update(u64 start, u64 size, enum e820_type old_type, enum e820_type new_type);
>  extern u64  e820__range_remove(u64 start, u64 size, enum e820_type old_type, bool check_type);
> +extern u64  e820__range_set_crypto_capable(u64 start, u64 size);
>  
>  extern void e820__print_table(char *who);
>  extern int  e820__update_table(struct e820_table *table);
> diff --git a/arch/x86/include/asm/e820/types.h b/arch/x86/include/asm/e820/types.h
> index 314f75d886d0..aef03c665f5e 100644
> --- a/arch/x86/include/asm/e820/types.h
> +++ b/arch/x86/include/asm/e820/types.h
> @@ -46,6 +46,11 @@ enum e820_type {
>  	E820_TYPE_RESERVED_KERN	= 128,
>  };
>  
> +enum e820_crypto_capabilities {
> +	E820_NOT_CRYPTO_CAPABLE	= 0,
> +	E820_CRYPTO_CAPABLE	= 1,
> +};

Is this expected to grow beyond a bool?

> +
>  /*
>   * A single E820 map entry, describing a memory range of [addr...addr+size-1],
>   * of 'type' memory type:
> @@ -53,9 +58,10 @@ enum e820_type {
>   * (We pack it because there can be thousands of them on large systems.)
>   */
>  struct e820_entry {
> -	u64			addr;
> -	u64			size;
> -	enum e820_type		type;
> +	u64				addr;
> +	u64				size;
> +	enum e820_type			type;
> +	enum e820_crypto_capabilities	crypto_capable;
>  } __attribute__((packed));

Is there any concern about growing this structure? The "thousands" note
in the comment is likely rare. FWIW, this seems fine to me, but I
thought I'd mention it.
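
For a rough sense of the growth, here is a minimal userspace sketch. It
assumes 4-byte enums (typical for GCC and Clang on x86, but
implementation-defined) and uses <stdint.h> types in place of the
kernel's u64. All sketch_ names are made up for illustration and are not
part of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum sketch_type { SKETCH_TYPE_RAM = 1 };
enum sketch_crypto { SKETCH_NOT_CRYPTO_CAPABLE = 0, SKETCH_CRYPTO_CAPABLE = 1 };

/* Layout before the patch: addr, size, type. */
struct sketch_entry_old {
	uint64_t addr;
	uint64_t size;
	enum sketch_type type;
} __attribute__((packed));

/* Layout after the patch: one more enum member. */
struct sketch_entry_new {
	uint64_t addr;
	uint64_t size;
	enum sketch_type type;
	enum sketch_crypto crypto_capable;
} __attribute__((packed));

/* Because the structs are packed, the growth is exactly the new member. */
static_assert(sizeof(struct sketch_entry_new) ==
	      sizeof(struct sketch_entry_old) + sizeof(enum sketch_crypto),
	      "packed entry grows by the size of the new member");

/* Extra bytes needed by a table holding nr entries. */
static size_t sketch_table_growth(size_t nr)
{
	return nr * (sizeof(struct sketch_entry_new) -
		     sizeof(struct sketch_entry_old));
}
```

Under those assumptions an entry goes from 20 to 24 bytes, so even a
table with a few thousand entries grows by only a few kilobytes.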

-- 
Kees Cook

* Re: [PATCH v6 3/6] x86/e820: Refactor range_update and range_remove
  2022-02-07 21:45   ` Kees Cook
@ 2022-02-08  8:40     ` Mike Rapoport
  2022-02-08 21:01       ` Martin Fernandez
  2022-02-08 21:09     ` Martin Fernandez
  2022-03-04 20:32     ` Martin Fernandez
  2 siblings, 1 reply; 37+ messages in thread
From: Mike Rapoport @ 2022-02-08  8:40 UTC (permalink / raw)
  To: Kees Cook
  Cc: Martin Fernandez, linux-kernel, linux-efi, platform-driver-x86,
	linux-mm, tglx, mingo, bp, dave.hansen, x86, hpa, ardb, dvhart,
	andy, gregkh, rafael, akpm, daniel.gutson, hughsient,
	alex.bazhaniuk, alison.schofield

On Mon, Feb 07, 2022 at 01:45:40PM -0800, Kees Cook wrote:
> On Thu, Feb 03, 2022 at 01:43:25PM -0300, Martin Fernandez wrote:
> > __e820__range_update and e820__range_remove had a very similar
> > implementation with a few lines different from each other, the lines
> > that actually perform the modification over the e820_table. The
> > similarities were found in the checks for the different cases on how
> > each entry intersects with the given range (if it does at all). These
> > checks were very precise and error prone so it was not a good idea to
> > have them in both places.
> 
> Yay removing copy/paste code! :)

Removing copy/paste is nice but diffstat of

 arch/x86/kernel/e820.c | 383 ++++++++++++++++++++++++++++++-----------
 1 file changed, 283 insertions(+), 100 deletions(-)

does not look nice even accounting for lots of comments :(

I didn't look closely, but the diffstat suggests that the refactoring is
making things much more complex.
 
> > 
> > I propose a refactor of those functions, given that I need to create a
> > similar one for this patchset.
> 
> The diff here is pretty hard (for me) to review; I'll need more time
> to check it. What might make review easier (at least for me), is to
> incrementally change these routines. i.e. separate patches to:
> 
> - add the new infrastructure
> - replace e820__range_remove
> - replace __e820__range_update
> 
> If that's not actually useful, no worries. I'll just stare at it a bit
> more. :)

-- 
Sincerely yours,
Mike.

* Re: [PATCH v6 1/6] mm/memblock: Tag memblocks with crypto capabilities
  2022-02-07 21:18   ` Kees Cook
@ 2022-02-08 14:39     ` Martin Fernandez
  0 siblings, 0 replies; 37+ messages in thread
From: Martin Fernandez @ 2022-02-08 14:39 UTC (permalink / raw)
  To: Kees Cook
  Cc: linux-kernel, linux-efi, platform-driver-x86, linux-mm, tglx,
	mingo, bp, dave.hansen, x86, hpa, ardb, dvhart, andy, gregkh,
	rafael, rppt, akpm, daniel.gutson, hughsient, alex.bazhaniuk,
	alison.schofield

On 2/7/22, Kees Cook <keescook@chromium.org> wrote:
> On Thu, Feb 03, 2022 at 01:43:23PM -0300, Martin Fernandez wrote:
>> +/**
>> + * memblock_node_is_crypto_capable - get if whole node is capable
>> + * of encryption
>> + * @nid: number of node
>> + *
>> + * Iterate over all memory memblock_type and find if all regions under
>> + * node @nid are capable of hardware encryption.
>> + *
>> + * Return:
>> + * true if every region in memory memblock_type is capable of
>> + * encryption, false otherwise.
>> + */
>> +bool __init_memblock memblock_node_is_crypto_capable(int nid)
>> +{
>> +	struct memblock_region *region;
>> +	bool crypto_capable = false;
>> +	bool not_crypto_capable = false;
>> +
>> +	for_each_mem_region(region) {
>> +		if (memblock_get_region_node(region) == nid) {
>> +			crypto_capable =
>> +				crypto_capable ||
>> +				(region->flags & MEMBLOCK_CRYPTO_CAPABLE);
>
> This was already mentioned, but I just thought I'd add: this made me
> double-take, given the "||" (instead of "|") in an assignment. It looked
> like a typo, but yes it's correct. I was expecting something like:
>
> 			crypto_capable |=
> 				!!(region->flags & MEMBLOCK_CRYPTO_CAPABLE);
>
>> +			not_crypto_capable =
>> +				not_crypto_capable ||
>> +				!(region->flags & MEMBLOCK_CRYPTO_CAPABLE);
>
> 			not_crypto_capable |=
> 				!(region->flags & MEMBLOCK_CRYPTO_CAPABLE);
>

Yes, this also works. Thanks.
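
For reference, a small userspace sketch of why both spellings give the
same result. The flag value and the sketch_ names are placeholders, not
the kernel's definitions; the point is only that "!!" collapses a masked
flag to 0 or 1 before it is OR-ed into a bool.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_CRYPTO_CAPABLE 0x10UL /* placeholder bit, not the kernel's value */

/* Accumulate with logical OR, as in the patch. */
static bool sketch_any_capable_logical(const unsigned long *flags, size_t n)
{
	bool capable = false;
	size_t i;

	for (i = 0; i < n; i++)
		capable = capable || (flags[i] & SKETCH_CRYPTO_CAPABLE);
	return capable;
}

/* Accumulate with |= and !!, as suggested in the review. */
static bool sketch_any_capable_bitwise(const unsigned long *flags, size_t n)
{
	bool capable = false;
	size_t i;

	for (i = 0; i < n; i++)
		capable |= !!(flags[i] & SKETCH_CRYPTO_CAPABLE);
	return capable;
}

/* Run both forms and check that they agree before returning the result. */
static bool sketch_any_capable(const unsigned long *flags, size_t n)
{
	bool a = sketch_any_capable_logical(flags, n);
	bool b = sketch_any_capable_bitwise(flags, n);

	assert(a == b);
	return a;
}
```

The same holds for the negated accumulator, since "!" already yields 0
or 1.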

* Re: [PATCH v6 4/6] x86/e820: Tag e820_entry with crypto capabilities
  2022-02-07 21:56   ` Kees Cook
@ 2022-02-08 14:46     ` Martin Fernandez
  0 siblings, 0 replies; 37+ messages in thread
From: Martin Fernandez @ 2022-02-08 14:46 UTC (permalink / raw)
  To: Kees Cook
  Cc: linux-kernel, linux-efi, platform-driver-x86, linux-mm, tglx,
	mingo, bp, dave.hansen, x86, hpa, ardb, dvhart, andy, gregkh,
	rafael, rppt, akpm, daniel.gutson, hughsient, alex.bazhaniuk,
	alison.schofield

On 2/7/22, Kees Cook <keescook@chromium.org> wrote:
> On Thu, Feb 03, 2022 at 01:43:26PM -0300, Martin Fernandez wrote:
>> Add a new enum for crypto capabilities.
>>
>> Add a new member in e820_entry to hold whether an entry is able to do
>> hardware memory encryption or not.
>>
>> Add a new function e820__range_set_crypto_capable to mark all the
>> entries in a range of addresses as encryptable. This will be called
>> when initializing EFI.
>>
>> Change e820__update_table to handle merging and overlap problems
>> taking into account crypto_capable.
>>
>> Signed-off-by: Martin Fernandez <martin.fernandez@eclypsium.com>
>> ---
>>  arch/x86/include/asm/e820/api.h   |   1 +
>>  arch/x86/include/asm/e820/types.h |  12 +++-
>>  arch/x86/kernel/e820.c            | 114 ++++++++++++++++++++++++++++--
>>  3 files changed, 119 insertions(+), 8 deletions(-)
>>
>> diff --git a/arch/x86/include/asm/e820/api.h
>> b/arch/x86/include/asm/e820/api.h
>> index e8f58ddd06d9..4b3b01fafdd1 100644
>> --- a/arch/x86/include/asm/e820/api.h
>> +++ b/arch/x86/include/asm/e820/api.h
>> @@ -17,6 +17,7 @@ extern bool e820__mapped_all(u64 start, u64 end, enum
>> e820_type type);
>>  extern void e820__range_add   (u64 start, u64 size, enum e820_type
>> type);
>>  extern u64  e820__range_update(u64 start, u64 size, enum e820_type
>> old_type, enum e820_type new_type);
>>  extern u64  e820__range_remove(u64 start, u64 size, enum e820_type
>> old_type, bool check_type);
>> +extern u64  e820__range_set_crypto_capable(u64 start, u64 size);
>>
>>  extern void e820__print_table(char *who);
>>  extern int  e820__update_table(struct e820_table *table);
>> diff --git a/arch/x86/include/asm/e820/types.h
>> b/arch/x86/include/asm/e820/types.h
>> index 314f75d886d0..aef03c665f5e 100644
>> --- a/arch/x86/include/asm/e820/types.h
>> +++ b/arch/x86/include/asm/e820/types.h
>> @@ -46,6 +46,11 @@ enum e820_type {
>>  	E820_TYPE_RESERVED_KERN	= 128,
>>  };
>>
>> +enum e820_crypto_capabilities {
>> +	E820_NOT_CRYPTO_CAPABLE	= 0,
>> +	E820_CRYPTO_CAPABLE	= 1,
>> +};
>
> Is this expected to grow beyond a bool?
>

People commented that it might be a good idea to record the source of
the cryptographic capabilities, which in this case would be the EFI
memmap. So this could grow in that case.

Also, the enum makes the code self-explanatory where it is used.

* Re: [PATCH v6 3/6] x86/e820: Refactor range_update and range_remove
  2022-02-08  8:40     ` Mike Rapoport
@ 2022-02-08 21:01       ` Martin Fernandez
  2022-02-15  7:10         ` Mike Rapoport
  0 siblings, 1 reply; 37+ messages in thread
From: Martin Fernandez @ 2022-02-08 21:01 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: Kees Cook, linux-kernel, linux-efi, platform-driver-x86,
	linux-mm, tglx, mingo, bp, dave.hansen, x86, hpa, ardb, dvhart,
	andy, gregkh, rafael, akpm, daniel.gutson, hughsient,
	alex.bazhaniuk, alison.schofield

On 2/8/22, Mike Rapoport <rppt@kernel.org> wrote:
> On Mon, Feb 07, 2022 at 01:45:40PM -0800, Kees Cook wrote:
>> On Thu, Feb 03, 2022 at 01:43:25PM -0300, Martin Fernandez wrote:
>> > __e820__range_update and e820__range_remove had a very similar
>> > implementation with a few lines different from each other, the lines
>> > that actually perform the modification over the e820_table. The
>> > similarities were found in the checks for the different cases on how
>> > each entry intersects with the given range (if it does at all). These
>> > checks were very precise and error prone so it was not a good idea to
>> > have them in both places.
>>
>> Yay removing copy/paste code! :)
>
> Removing copy/paste is nice but diffstat of
>
>  arch/x86/kernel/e820.c | 383 ++++++++++++++++++++++++++++++-----------
>  1 file changed, 283 insertions(+), 100 deletions(-)
>
> does not look nice even accounting for lots of comments :(
>
> I didn't look closely, but the diffstat suggests that the refactoring is
> making things much more complex.
>

Yes, that diffstat surprised me as well.

I have to mention that 110 of those lines are kerneldocs and blank
lines, which is quite a lot. Also you have to take into account that I
expanded most of the function definitions for better formatting, which
also took some space.

And since I was able to concentrate the "hard" part of the problem in a
single function, it can be tested easily, as Kees suggested. I'm
planning to do that in the next version of the patch.
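
To illustrate what a single function buys for testing, here is a
userspace sketch of the case analysis such a helper has to get right. It
is not the patch's code: the enum and function names are hypothetical,
and it only classifies how a range [start, end) relates to an entry,
mirroring the conditions used in the patch.

```c
#include <assert.h>
#include <stdint.h>

enum sketch_overlap {
	SKETCH_NONE,		/* range and entry do not intersect */
	SKETCH_RANGE_COVERS,	/* range completely covers entry */
	SKETCH_ENTRY_COVERS,	/* entry extends past range on both sides */
	SKETCH_PARTIAL,		/* range overlaps the head or tail of entry */
};

static enum sketch_overlap
sketch_classify(uint64_t start, uint64_t end,
		uint64_t entry_addr, uint64_t entry_end)
{
	/* Bounds of the intersection, if there is one. */
	uint64_t inner_start = start > entry_addr ? start : entry_addr;
	uint64_t inner_end = end < entry_end ? end : entry_end;

	assert(start <= end && entry_addr <= entry_end);

	if (entry_addr >= start && entry_end <= end)
		return SKETCH_RANGE_COVERS;
	if (start > entry_addr && end < entry_end)
		return SKETCH_ENTRY_COVERS;
	if (inner_start < inner_end)
		return SKETCH_PARTIAL;
	return SKETCH_NONE;
}
```

Each corner case then becomes one assertion in a test. For example, a
range that only touches an entry's boundary must classify as no overlap.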

* Re: [PATCH v6 3/6] x86/e820: Refactor range_update and range_remove
  2022-02-03 16:43 ` [PATCH v6 3/6] x86/e820: Refactor range_update and range_remove Martin Fernandez
  2022-02-07 21:45   ` Kees Cook
@ 2022-02-08 21:04   ` Daniel Gutson
  1 sibling, 0 replies; 37+ messages in thread
From: Daniel Gutson @ 2022-02-08 21:04 UTC (permalink / raw)
  To: Martin Fernandez
  Cc: linux-kernel, linux-efi, platform-driver-x86, linux-mm,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	H. Peter Anvin, ardb, dvhart, andy, Greg Kroah-Hartman, rafael,
	rppt, akpm, Richard Hughes, Alex Bazhaniuk, Alison Schofield,
	Kees Cook

On Thu, Feb 3, 2022 at 1:44 PM Martin Fernandez
<martin.fernandez@eclypsium.com> wrote:
>
> __e820__range_update and e820__range_remove had a very similar
> implementation with a few lines different from each other, the lines
> that actually perform the modification over the e820_table. The
> similarities were found in the checks for the different cases on how
> each entry intersects with the given range (if it does at all). These
> checks were very precise and error prone so it was not a good idea to
> have them in both places.
>
> I propose a refactor of those functions, given that I need to create a
> similar one for this patchset.
>
> Add a function to modify an E820 table in a given range. This
> modification is backed by two helper structs:
> e820_entry_updater and e820_*_data.
>
> The first one, e820_entry_updater, carries 3 callbacks which function
> as the actions to take on the table.
>
> The other one, e820_*_data carries information needed by the
> callbacks, for example in the case of range_update it will carry the
> type that we are targeting.
>
> Signed-off-by: Martin Fernandez <martin.fernandez@eclypsium.com>
> ---
>  arch/x86/kernel/e820.c | 383 ++++++++++++++++++++++++++++++-----------
>  1 file changed, 283 insertions(+), 100 deletions(-)
>
> diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c
> index bc0657f0deed..89b78c6b345b 100644
> --- a/arch/x86/kernel/e820.c
> +++ b/arch/x86/kernel/e820.c
> @@ -459,144 +459,327 @@ static int __init append_e820_table(struct boot_e820_entry *entries, u32 nr_entr
>         return __append_e820_table(entries, nr_entries);
>  }
>
> +/**
> + * e820_entry_updater - Helper type for __e820__handle_range_update().
> + * @should_update: Return true if @entry needs to be updated, false
> + * otherwise.
> + * @update: Apply desired actions to an @entry that is inside the
> + * range and satisfies @should_update.
> + * @new: Create new entry in the table with information gathered from
> + * @original and @data.
> + *
> + * Each function corresponds to an action that
> + * __e820__handle_range_update() does. Callbacks need to cast @data back
> + * to the corresponding type.
> + */
> +struct e820_entry_updater {
> +       bool (*should_update)(const struct e820_entry *entry, const void *data);
> +       void (*update)(struct e820_entry *entry, const void *data);
> +       void (*new)(struct e820_table *table, u64 new_start, u64 new_size,
> +                   const struct e820_entry *original, const void *data);
> +};
> +
> +/**
> + * e820_remover_data - Helper type for e820__range_remove().
> + * @old_type: old_type parameter of e820__range_remove().
> + * @check_type: check_type parameter of e820__range_remove().
> + *
> + * This is intended to be used as the @data argument for the
> + * e820_entry_updater callbacks.
> + */
> +struct e820_remover_data {
> +       enum e820_type old_type;
> +       bool check_type;
> +};
> +
> +/**
> + * e820_type_updater_data - Helper type for __e820__range_update().
> + * @old_type: old_type parameter of __e820__range_update().
> + * @new_type: new_type parameter of __e820__range_update().
> + *
> + * This is intended to be used as the @data argument for the
> + * e820_entry_updater callbacks.
> + */
> +struct e820_type_updater_data {
> +       enum e820_type old_type;
> +       enum e820_type new_type;
> +};
> +
> +/**
> + * __e820__handle_intersected_range_update() - Helper function for
> + * __e820__handle_range_update().
> + * @table: Target e820_table.
> + * @start: Start of the range.
> + * @size: Size of the range.
> + * @entry: Current entry that __e820__handle_range_update() was
> + * looking into.
> + * @updater: updater parameter of __e820__handle_range_update().
> + * @data: data parameter of __e820__handle_range_update().
> + *
> + * Helper for __e820__handle_range_update to handle the case where
> + * neither the entry completely covers the range nor the range
> + * completely covers the entry.
> + *
> + * Return: The updated size.
> + */
>  static u64 __init
> -__e820__range_update(struct e820_table *table, u64 start, u64 size, enum e820_type old_type, enum e820_type new_type)
> +__e820__handle_intersected_range_update(struct e820_table *table,
> +                                       u64 start,
> +                                       u64 size,
> +                                       struct e820_entry *entry,
> +                                       const struct e820_entry_updater *updater,
> +                                       const void *data)
>  {
>         u64 end;
> -       unsigned int i;
> -       u64 real_updated_size = 0;
> -
> -       BUG_ON(old_type == new_type);
> +       u64 entry_end = entry->addr + entry->size;
> +       u64 inner_start;
> +       u64 inner_end;
> +       u64 updated_size = 0;
>
>         if (size > (ULLONG_MAX - start))
>                 size = ULLONG_MAX - start;
>
>         end = start + size;
> -       printk(KERN_DEBUG "e820: update [mem %#010Lx-%#010Lx] ", start, end - 1);
> -       e820_print_type(old_type);
> -       pr_cont(" ==> ");
> -       e820_print_type(new_type);
> -       pr_cont("\n");
> -
> -       for (i = 0; i < table->nr_entries; i++) {
> -               struct e820_entry *entry = &table->entries[i];
> -               u64 final_start, final_end;
> -               u64 entry_end;
> +       inner_start = max(start, entry->addr);
> +       inner_end = min(end, entry_end);
> +
> +       /* Range and entry do intersect and... */
> +       if (inner_start < inner_end) {
> +               /* Entry is on the left */
> +               if (entry->addr < inner_start) {
> +                       /* Resize current entry */
> +                       entry->size = inner_start - entry->addr;
> +               /* Entry is on the right */
> +               } else {
> +                       /* Resize and move current section */
> +                       entry->addr = inner_end;
> +                       entry->size = entry_end - inner_end;
> +               }
> +               /* Create new entry with intersected region */
> +               updater->new(table, inner_start, inner_end - inner_start, entry, data);
>
> -               if (entry->type != old_type)
> -                       continue;
> +               updated_size += inner_end - inner_start;
> +       } /* Else: [start, end) doesn't cover entry */
>
> -               entry_end = entry->addr + entry->size;
> +       return updated_size;
> +}
>
> -               /* Completely covered by new range? */
> -               if (entry->addr >= start && entry_end <= end) {
> -                       entry->type = new_type;
> -                       real_updated_size += entry->size;
> -                       continue;
> -               }
> +/** __e820__handle_range_update(): Helper function to update an address
> + * range in an e820_table
> + * @table: e820_table that we want to modify.
> + * @start: Start of the range.
> + * @size: Size of the range.
> + * @updater: Callbacks to modify the table.
> + * @data: Information to modify the table.
> + *
> + * Update the table @table in [@start, @start + @size) doing the
> + * actions given in @updater.
> + *
> + * Return: The updated size.
> + */
> +static u64 __init
> +__e820__handle_range_update(struct e820_table *table,
> +                           u64 start,
> +                           u64 size,
> +                           const struct e820_entry_updater *updater,
> +                           const void *data)
> +{
> +       u64 updated_size = 0;
> +       u64 end;
> +       unsigned int i;
>
> -               /* New range is completely covered? */
> -               if (entry->addr < start && entry_end > end) {
> -                       __e820__range_add(table, start, size, new_type);
> -                       __e820__range_add(table, end, entry_end - end, entry->type);
> -                       entry->size = start - entry->addr;
> -                       real_updated_size += size;
> -                       continue;
> -               }
> +       if (size > (ULLONG_MAX - start))
> +               size = ULLONG_MAX - start;
>
> -               /* Partially covered: */
> -               final_start = max(start, entry->addr);
> -               final_end = min(end, entry_end);
> -               if (final_start >= final_end)
> -                       continue;
> +       end = start + size;
>
> -               __e820__range_add(table, final_start, final_end - final_start, new_type);
> +       for (i = 0; i < table->nr_entries; i++) {
> +               struct e820_entry *entry = &table->entries[i];
> +               u64 entry_end = entry->addr + entry->size;
> +
> +               if (updater->should_update(entry, data)) {
> +                       /* Range completely covers entry */
> +                       if (entry->addr >= start && entry_end <= end) {
> +                               updater->update(entry, data);
> +                               updated_size += entry->size;
> +                       /* Entry completely covers range */
> +                       } else if (start > entry->addr && end < entry_end) {
> +                               /* Resize current entry */
> +                               entry->size = start - entry->addr;
> +
> +                               /* Create new entry with intersection region */
> +                               updater->new(table, start, size, entry, data);
> +
> +                               /*
> +                                * Create a new entry for the leftover
> +                                * of the current entry
> +                                */
> +                               __e820__range_add(table, end, entry_end - end,
> +                                                 entry->type);
> +
> +                               updated_size += size;
> +                       } else {
> +                               updated_size +=
> +                                       __e820__handle_intersected_range_update(table, start, size,
> +                                                                               entry, updater, data);
> +                       }
> +               }
> +       }
>
> -               real_updated_size += final_end - final_start;
> +       return updated_size;
> +}
>
> -               /*
> -                * Left range could be head or tail, so need to update
> -                * its size first:
> -                */
> -               entry->size -= final_end - final_start;
> -               if (entry->addr < final_start)
> -                       continue;
> +static bool __init type_updater__should_update(const struct e820_entry *entry,
> +                                              const void *data)
> +{
> +       struct e820_type_updater_data *type_updater_data =
> +               (struct e820_type_updater_data *)data;

Please preserve const correctness. You are removing the const qualifier.
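
A minimal sketch of the const-preserving form, using stand-in types
rather than the kernel's (the sketch_ names are hypothetical). In C a
"const void *" converts implicitly to a pointer to a const-qualified
object type, so no cast is needed and the qualifier is kept:

```c
#include <assert.h>
#include <stdbool.h>

enum sketch_type { SKETCH_RAM = 1, SKETCH_RESERVED = 2 };

struct sketch_entry {
	enum sketch_type type;
};

struct sketch_type_updater_data {
	enum sketch_type old_type;
	enum sketch_type new_type;
};

/* Callback with the same shape as should_update in the patch. */
static bool sketch_should_update(const struct sketch_entry *entry,
				 const void *data)
{
	/* Implicit conversion from const void *: no cast, const is kept. */
	const struct sketch_type_updater_data *updater_data = data;

	assert(entry && data);
	return entry->type == updater_data->old_type;
}
```

With the pointer declared const, an accidental write through
updater_data is rejected at compile time.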

>
> -               entry->addr = final_end;
> -       }
> -       return real_updated_size;
> +       return entry->type == type_updater_data->old_type;
>  }
>
> -u64 __init e820__range_update(u64 start, u64 size, enum e820_type old_type, enum e820_type new_type)
> +static void __init type_updater__update(struct e820_entry *entry,
> +                                       const void *data)
>  {
> -       return __e820__range_update(e820_table, start, size, old_type, new_type);
> +       struct e820_type_updater_data *type_updater_data =
> +               (struct e820_type_updater_data *)data;
> +
> +       entry->type = type_updater_data->new_type;
>  }
>
> -static u64 __init e820__range_update_kexec(u64 start, u64 size, enum e820_type old_type, enum e820_type  new_type)
> +static void __init type_updater__new(struct e820_table *table, u64 new_start,
> +                                    u64 new_size,
> +                                    const struct e820_entry *original,
> +                                    const void *data)
>  {
> -       return __e820__range_update(e820_table_kexec, start, size, old_type, new_type);
> +       struct e820_type_updater_data *type_updater_data =
> +               (struct e820_type_updater_data *)data;
> +
> +       __e820__range_add(table, new_start, new_size,
> +                         type_updater_data->new_type);
>  }
>
> -/* Remove a range of memory from the E820 table: */
> -u64 __init e820__range_remove(u64 start, u64 size, enum e820_type old_type, bool check_type)
> +static u64 __init __e820__range_update(struct e820_table *table, u64 start,
> +                                      u64 size, enum e820_type old_type,
> +                                      enum e820_type new_type)
>  {
> -       int i;
> -       u64 end;
> -       u64 real_removed_size = 0;
> +       struct e820_entry_updater updater = {
> +               .should_update = type_updater__should_update,
> +               .update = type_updater__update,
> +               .new = type_updater__new
> +       };
>
> -       if (size > (ULLONG_MAX - start))
> -               size = ULLONG_MAX - start;
> +       struct e820_type_updater_data data = {
> +               .old_type = old_type,
> +               .new_type = new_type
> +       };
>
> -       end = start + size;
> -       printk(KERN_DEBUG "e820: remove [mem %#010Lx-%#010Lx] ", start, end - 1);
> -       if (check_type)
> -               e820_print_type(old_type);
> +       BUG_ON(old_type == new_type);
> +
> +       printk(KERN_DEBUG "e820: update [mem %#018Lx-%#018Lx] ", start,
> +              start + size - 1);
> +       e820_print_type(old_type);
> +       pr_cont(" ==> ");
> +       e820_print_type(new_type);
>         pr_cont("\n");
>
> -       for (i = 0; i < e820_table->nr_entries; i++) {
> -               struct e820_entry *entry = &e820_table->entries[i];
> -               u64 final_start, final_end;
> -               u64 entry_end;
> +       return __e820__handle_range_update(table, start, size, &updater, &data);
> +}
>
> -               if (check_type && entry->type != old_type)
> -                       continue;
> +static bool __init remover__should_update(const struct e820_entry *entry,
> +                                         const void *data)
> +{
> +       struct e820_remover_data *remover_data =
> +               (struct e820_remover_data *)data;
>
> -               entry_end = entry->addr + entry->size;
> +       return !remover_data->check_type ||
> +              entry->type == remover_data->old_type;
> +}
>
> -               /* Completely covered? */
> -               if (entry->addr >= start && entry_end <= end) {
> -                       real_removed_size += entry->size;
> -                       memset(entry, 0, sizeof(*entry));
> -                       continue;
> -               }
> +static void __init remover__update(struct e820_entry *entry, const void *data)
> +{
> +       memset(entry, 0, sizeof(*entry));
> +}
>
> -               /* Is the new range completely covered? */
> -               if (entry->addr < start && entry_end > end) {
> -                       e820__range_add(end, entry_end - end, entry->type);
> -                       entry->size = start - entry->addr;
> -                       real_removed_size += size;
> -                       continue;
> -               }
> +static void __init remover__new(struct e820_table *table, u64 new_start,
> +                               u64 new_size, const struct e820_entry *original,
> +                               const void *data)
> +{
> +}
>
> -               /* Partially covered: */
> -               final_start = max(start, entry->addr);
> -               final_end = min(end, entry_end);
> -               if (final_start >= final_end)
> -                       continue;
> +/**
> + * e820__range_remove() - Remove an address range from e820_table.
> + * @start: Start of the address range.
> + * @size: Size of the address range.
> + * @old_type: Type of the entries that we want to remove.
> + * @check_type: Bool to decide if ignore @old_type or not.
> + *
> + * Remove [@start, @start + @size) from e820_table. If @check_type is
> + * true remove only entries with type @old_type.
> + *
> + * Return: The size removed.
> + */
> +u64 __init e820__range_remove(u64 start, u64 size, enum e820_type old_type,
> +                             bool check_type)
> +{
> +       struct e820_entry_updater updater = {
> +               .should_update = remover__should_update,
> +               .update = remover__update,
> +               .new = remover__new
> +       };
> +
> +       struct e820_remover_data data = {
> +               .check_type = check_type,
> +               .old_type = old_type
> +       };
> +
> +       printk(KERN_DEBUG "e820: remove [mem %#018Lx-%#018Lx] ", start,
> +              start + size - 1);
> +       if (check_type)
> +               e820_print_type(old_type);
> +       pr_cont("\n");
>
> -               real_removed_size += final_end - final_start;
> +       return __e820__handle_range_update(e820_table, start, size, &updater,
> +                                           &data);
> +}
>
> -               /*
> -                * Left range could be head or tail, so need to update
> -                * the size first:
> -                */
> -               entry->size -= final_end - final_start;
> -               if (entry->addr < final_start)
> -                       continue;
> +/**
> + * e820__range_update() - Update the type of a given address range in
> + * e820_table.
> + * @start: Start of the range.
> + * @size: Size of the range.
> + * @old_type: Type that we want to change.
> + * @new_type: New type to replace @old_type.
> + *
> + * Update type of addresses in [@start, @start + @size) from @old_type
> + * to @new_type in e820_table.
> + *
> + * Return: The size updated.
> + */
> +u64 __init e820__range_update(u64 start, u64 size, enum e820_type old_type,
> +                             enum e820_type new_type)
> +{
> +       return __e820__range_update(e820_table, start, size, old_type, new_type);
> +}
>
> -               entry->addr = final_end;
> -       }
> -       return real_removed_size;
> +/**
> + * e820__range_update_kexec() - Update the type of a given address
> + * range in e820_table_kexec.
> + * @start: Start of the range.
> + * @size: Size of the range.
> + * @old_type: Type that we want to change.
> + * @new_type: New type to replace @old_type.
> + *
> + * Update type of addresses in [@start, @start + @size) from @old_type
> + * to @new_type in e820_table_kexec.
> + *
> + * Return: The size updated.
> + */
> +static u64 __init e820__range_update_kexec(u64 start, u64 size,
> +                                          enum e820_type old_type,
> +                                          enum e820_type new_type)
> +{
> +       return __e820__range_update(e820_table_kexec, start, size, old_type, new_type);
>  }
>
>  void __init e820__update_table_print(void)
> --
> 2.30.2
>

* Re: [PATCH v6 3/6] x86/e820: Refactor range_update and range_remove
  2022-02-07 21:45   ` Kees Cook
  2022-02-08  8:40     ` Mike Rapoport
@ 2022-02-08 21:09     ` Martin Fernandez
  2022-03-04 20:32     ` Martin Fernandez
  2 siblings, 0 replies; 37+ messages in thread
From: Martin Fernandez @ 2022-02-08 21:09 UTC (permalink / raw)
  To: Kees Cook
  Cc: linux-kernel, linux-efi, platform-driver-x86, linux-mm, tglx,
	mingo, bp, dave.hansen, x86, hpa, ardb, dvhart, andy, gregkh,
	rafael, rppt, akpm, daniel.gutson, hughsient, alex.bazhaniuk,
	alison.schofield

On 2/7/22, Kees Cook <keescook@chromium.org> wrote:
> On Thu, Feb 03, 2022 at 01:43:25PM -0300, Martin Fernandez wrote:
>> __e820__range_update and e820__range_remove had a very similar
>> I propose a refactor of those functions, given that I need to create a
>> similar one for this patchset.
>
> The diff here is pretty hard (for me) to review; I'll need more time
> to check it. What might make review easier (at least for me), is to
> incrementally change these routines. i.e. separate patches to:
>
> - add the new infrastructure
> - replace e820__range_remove
> - replace __e820__range_update
>
> If that's not actually useful, no worries. I'll just stare at it a bit
> more. :)
>

Yep, that's a good idea. I'll keep that in mind for the next patch.

>>
>> Add a function to modify a E820 table in a given range. This
>> modification is done backed up by two helper structs:
>> e820_entry_updater and e820_*_data.
>>
>> The first one, e820_entry_updater, carries 3 callbacks which function
>> as the actions to take on the table.
>>
>> The other one, e820_*_data carries information needed by the
>> callbacks, for example in the case of range_update it will carry the
>> type that we are targeting.
>
> Something I think would be really amazing here is if you could add KUnit
> tests here to exercise the corner cases and validate the changes. It
> should be pretty easy to add. Here's a quick example for the boilerplate
> and testing a bit of __e820__range_add():
>
> #ifdef CONFIG_E820_KUNIT_TEST
> #include <kunit/test.h>
>
> static void __init test_e820_range_add(struct kunit *context)
> {
> 	struct e820_table table;
> 	u32 full;
>
> 	full = ARRAY_SIZE(table.entries);
> 	/* Add last entry. */
> 	table.nr_entries = full - 1;
> 	__e820__range_add(&table, 0, 15, 0);
> 	KUNIT_EXPECT_EQ(context, table.nr_entries, full);
> 	/* Skip new entry when full. */
> 	__e820__range_add(&table, 0, 15, 0);
> 	KUNIT_EXPECT_EQ(context, table.nr_entries, full);
> }
>
> static void __init test_e820_update(struct kunit *context)
> {
> ...
> }
>
> static struct kunit_case __refdata e820_test_cases[] = {
>         KUNIT_CASE(test_e820_range_add),
>         KUNIT_CASE(test_e820_update),
> 	...
>         {}
> };
>
> static struct kunit_suite e820_test_suite = {
>         .name = "e820",
>         .test_cases = e820_test_cases,
> };
>
> kunit_test_suites(&e820_test_suite);
> #endif
>

Oh that's awesome! I'll definitely take a look at KUnit and integrate
it into this patch. Thanks for the code snippet!
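
As a rough userspace illustration of what the quoted test checks (a fixed-size table that accepts the last free slot and then silently skips further additions), here is a minimal sketch. It is not kernel code: `range_add`, `MAX_ENTRIES` and `demo_fill_table` are stand-ins invented for this example, and the table is deliberately tiny.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define MAX_ENTRIES 4 /* assumption: tiny on purpose, the real table is larger */
#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

struct entry {
	uint64_t addr;
	uint64_t size;
	int type;
};

struct table {
	uint32_t nr_entries;
	struct entry entries[MAX_ENTRIES];
};

/* Append an entry, or drop it with a message when the table is full. */
void range_add(struct table *t, uint64_t start, uint64_t size, int type)
{
	uint32_t x = t->nr_entries;

	if (x >= ARRAY_SIZE(t->entries)) {
		fprintf(stderr, "too many entries; ignoring [%llu, %llu)\n",
			(unsigned long long)start,
			(unsigned long long)(start + size));
		return;
	}

	t->entries[x] = (struct entry){ start, size, type };
	t->nr_entries++;
}

/* Mirrors the two steps of the quoted test; returns the final entry count. */
uint32_t demo_fill_table(void)
{
	struct table table = { 0 };
	uint32_t full = ARRAY_SIZE(table.entries);

	/* Add last entry. */
	table.nr_entries = full - 1;
	range_add(&table, 0, 15, 0);
	assert(table.nr_entries == full);

	/* Skip new entry when full. */
	range_add(&table, 0, 15, 0);
	assert(table.nr_entries == full);

	return table.nr_entries;
}
```

Calling `demo_fill_table()` from a `main()` exercises both steps; the second `range_add()` only prints the "too many entries" message and leaves the count unchanged.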


* Re: [PATCH v6 3/6] x86/e820: Refactor range_update and range_remove
  2022-02-08 21:01       ` Martin Fernandez
@ 2022-02-15  7:10         ` Mike Rapoport
  2022-02-15 14:14           ` Martin Fernandez
  0 siblings, 1 reply; 37+ messages in thread
From: Mike Rapoport @ 2022-02-15  7:10 UTC (permalink / raw)
  To: Martin Fernandez
  Cc: Kees Cook, linux-kernel, linux-efi, platform-driver-x86,
	linux-mm, tglx, mingo, bp, dave.hansen, x86, hpa, ardb, dvhart,
	andy, gregkh, rafael, akpm, daniel.gutson, hughsient,
	alex.bazhaniuk, alison.schofield

Hi Martin,

On Tue, Feb 08, 2022 at 06:01:21PM -0300, Martin Fernandez wrote:
> On 2/8/22, Mike Rapoport <rppt@kernel.org> wrote:
> > On Mon, Feb 07, 2022 at 01:45:40PM -0800, Kees Cook wrote:
> >> On Thu, Feb 03, 2022 at 01:43:25PM -0300, Martin Fernandez wrote:
> >> > __e820__range_update and e820__range_remove had a very similar
> >> > implementation with a few lines different from each other, the lines
> >> > that actually perform the modification over the e820_table. The
> >> > similarities were found in the checks for the different cases on how
> >> > each entry intersects with the given range (if it does at all). These
> >> > checks were very precise and error prone so it was not a good idea to
> >> > have them in both places.
> >>
> >> Yay removing copy/paste code! :)
> >
> > Removing copy/paste is nice but diffstat of
> >
> >  arch/x86/kernel/e820.c | 383 ++++++++++++++++++++++++++++++-----------
> >  1 file changed, 283 insertions(+), 100 deletions(-)
> >
> > does not look nice even accounting for lots of comments :(
> >
> > I didn't look closely, but the diffstat hints that the refactoring is
> > making things much more complex.
> >
> 
> Yes, that diffstat surprised me as well.
> 
> I have to mention that 110 of those lines are kerneldocs and blank
> lines, which is quite a lot. Also you have to take into account that I
> expanded most of the function definitions for better formatting, which
> also took some space.

At last I had time to look more closely and I think that using a set of
callbacks is over-complicated.

I think this can be done way simpler, e.g like this (untested) draft:

https://git.kernel.org/rppt/h/x86/e820-update-range


> And as I was able to focus the "hard" part of the problem into a
> single function, testing can be done easily as Kees suggested and I'm
> planning to do so in the next patch.



-- 
Sincerely yours,
Mike.


* Re: [PATCH v6 3/6] x86/e820: Refactor range_update and range_remove
  2022-02-15  7:10         ` Mike Rapoport
@ 2022-02-15 14:14           ` Martin Fernandez
  0 siblings, 0 replies; 37+ messages in thread
From: Martin Fernandez @ 2022-02-15 14:14 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: Kees Cook, linux-kernel, linux-efi, platform-driver-x86,
	linux-mm, tglx, mingo, bp, dave.hansen, x86, hpa, ardb, dvhart,
	andy, gregkh, rafael, akpm, daniel.gutson, hughsient,
	alex.bazhaniuk, alison.schofield

On 2/15/22, Mike Rapoport <rppt@kernel.org> wrote:
> Hi Martin,
>
> On Tue, Feb 08, 2022 at 06:01:21PM -0300, Martin Fernandez wrote:
>> On 2/8/22, Mike Rapoport <rppt@kernel.org> wrote:
>> > On Mon, Feb 07, 2022 at 01:45:40PM -0800, Kees Cook wrote:
>> >> On Thu, Feb 03, 2022 at 01:43:25PM -0300, Martin Fernandez wrote:
>> >> > __e820__range_update and e820__range_remove had a very similar
>> >> > implementation with a few lines different from each other, the lines
>> >> > that actually perform the modification over the e820_table. The
>> >> > similarities were found in the checks for the different cases on how
>> >> > each entry intersects with the given range (if it does at all). These
>> >> > checks were very precise and error prone so it was not a good idea to
>> >> > have them in both places.
>> >>
>> >> Yay removing copy/paste code! :)
>> >
>> > Removing copy/paste is nice but diffstat of
>> >
>> >  arch/x86/kernel/e820.c | 383 ++++++++++++++++++++++++++++++-----------
>> >  1 file changed, 283 insertions(+), 100 deletions(-)
>> >
>> > does not look nice even accounting for lots of comments :(
>> >
>> > I didn't look closely, but the diffstat hints that the refactoring is
>> > making things much more complex.
>> >
>>
>> Yes, that diffstat surprised me as well.
>>
>> I have to mention that 110 of those lines are kerneldocs and blank
>> lines, which is quite a lot. Also you have to take into account that I
>> expanded most of the function definitions for better formatting, which
>> also took some space.
>
> At last I had time to look more closely and I think that using a set of
> callbacks is over-complicated.
>
> I think this can be done way simpler, e.g like this (untested) draft:
>
> https://git.kernel.org/rppt/h/x86/e820-update-range
>

Thanks for taking the time to review it.

Yeah, I did something like that in a previous version, although I
wasn't really happy with it.

https://lore.kernel.org/linux-efi/20220113213027.457282-4-martin.fernandez@eclypsium.com/

I think the struct holding the function arguments looks clearer than
what I did, but keep in mind that I need to create yet another
function similar to those, plus another parameter in the struct, and
with that I think __e820__range_update will look scary.

I'll give it a try anyway!
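
For readers who have not seen the patch, the callback shape being debated can be sketched in userspace C roughly as follows. This is an illustration only: the callback signatures, the `handle_range_update` walker and the `demo_*` helpers are assumptions made for this sketch, only two of the three callbacks are modeled (the one that builds new entries when an entry is split is left out), and only entries lying completely inside the range are handled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum mem_type { TYPE_RAM = 1, TYPE_RESERVED = 2, TYPE_ACPI = 3 };

struct entry { uint64_t addr, size; enum mem_type type; };
struct table { size_t nr_entries; struct entry entries[8]; };

/* Hypothetical callback set: decide whether an entry qualifies, then act. */
struct entry_updater {
	bool (*should_update)(const struct entry *e, const void *data);
	void (*update)(struct entry *e, const void *data);
};

/* Per-operation data handed to the callbacks. */
struct type_updater_data { enum mem_type old_type, new_type; };
struct remover_data { bool check_type; enum mem_type old_type; };

static bool type_should_update(const struct entry *e, const void *data)
{
	const struct type_updater_data *d = data;
	return e->type == d->old_type;
}

static void type_update(struct entry *e, const void *data)
{
	const struct type_updater_data *d = data;
	e->type = d->new_type;
}

static bool remover_should_update(const struct entry *e, const void *data)
{
	const struct remover_data *d = data;
	return !d->check_type || e->type == d->old_type;
}

static void remover_update(struct entry *e, const void *data)
{
	(void)data;
	*e = (struct entry){ 0 };	/* removal zeroes the slot */
}

/* Shared walk over the table; returns the number of bytes acted on. */
uint64_t handle_range_update(struct table *t, uint64_t start, uint64_t size,
			     const struct entry_updater *u, const void *data)
{
	uint64_t end = start + size, done = 0;

	for (size_t i = 0; i < t->nr_entries; i++) {
		struct entry *e = &t->entries[i];

		if (!u->should_update(e, data))
			continue;
		if (e->addr < start || e->addr + e->size > end)
			continue;	/* partial overlaps not modeled here */
		done += e->size;	/* count first: the remover zeroes size */
		u->update(e, data);
	}
	return done;
}

struct table demo;

static void fill(struct table *t)
{
	*t = (struct table){ .nr_entries = 3, .entries = {
		{ 0, 15, TYPE_RAM }, { 15, 15, TYPE_RAM }, { 30, 15, TYPE_ACPI },
	} };
}

uint64_t demo_type_update(void)
{
	struct entry_updater u = { type_should_update, type_update };
	struct type_updater_data d = { TYPE_RAM, TYPE_RESERVED };

	fill(&demo);
	return handle_range_update(&demo, 0, 30, &u, &d);
}

uint64_t demo_remove(void)
{
	struct entry_updater u = { remover_should_update, remover_update };
	struct remover_data d = { .check_type = true, .old_type = TYPE_RAM };

	fill(&demo);
	return handle_range_update(&demo, 0, 45, &u, &d);
}
```

The point of the shape is that the walk over the table is written once and each operation only supplies its predicate, its action and its data. The cost is the extra indirection that the review above objects to.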


* Re: [PATCH v6 3/6] x86/e820: Refactor range_update and range_remove
  2022-02-07 21:45   ` Kees Cook
  2022-02-08  8:40     ` Mike Rapoport
  2022-02-08 21:09     ` Martin Fernandez
@ 2022-03-04 20:32     ` Martin Fernandez
  2 siblings, 0 replies; 37+ messages in thread
From: Martin Fernandez @ 2022-03-04 20:32 UTC (permalink / raw)
  To: Kees Cook
  Cc: linux-kernel, linux-efi, platform-driver-x86, linux-mm, tglx,
	mingo, bp, dave.hansen, x86, hpa, ardb, dvhart, andy, gregkh,
	rafael, rppt, akpm, daniel.gutson, hughsient, alex.bazhaniuk,
	alison.schofield

On 2/7/22, Kees Cook <keescook@chromium.org> wrote:
> Something I think would be really amazing here is if you could add KUnit
> tests here to exercise the corner cases and validate the changes. It
> should be pretty easy to add. Here's a quick example for the boilerplate
> and testing a bit of __e820__range_add():
>
> #ifdef CONFIG_E820_KUNIT_TEST
> #include <kunit/test.h>
>
> static void __init test_e820_range_add(struct kunit *context)
> {
> 	struct e820_table table;
> 	u32 full;
>
> 	full = ARRAY_SIZE(table.entries);
> 	/* Add last entry. */
> 	table.nr_entries = full - 1;
> 	__e820__range_add(&table, 0, 15, 0);
> 	KUNIT_EXPECT_EQ(context, table.nr_entries, full);
> 	/* Skip new entry when full. */
> 	__e820__range_add(&table, 0, 15, 0);
> 	KUNIT_EXPECT_EQ(context, table.nr_entries, full);
> }
>
> static void __init test_e820_update(struct kunit *context)
> {
> ...
> }
>
> static struct kunit_case __refdata e820_test_cases[] = {
>         KUNIT_CASE(test_e820_range_add),
>         KUNIT_CASE(test_e820_update),
> 	...
>         {}
> };
>
> static struct kunit_suite e820_test_suite = {
>         .name = "e820",
>         .test_cases = e820_test_cases,
> };
>
> kunit_test_suites(&e820_test_suite);
> #endif

I almost got it. However, when I added the tests I got a warning at
compile time, because KUnit doesn't want to deal with __init things:

    WARNING: modpost: vmlinux.o(.data+0x26800): Section mismatch in reference from the variable __UNIQUE_ID_array286 to the variable .init.data:e820_test_suite
    The variable __UNIQUE_ID_array286 references
    the variable __initdata e820_test_suite
    If the reference is valid then annotate the
    variable with __init* or __refdata (see linux/init.h) or name the variable:
    *_template, *_timer, *_sht, *_ops, *_probe, *_probe_one, *_console

I need to test __init functions, and I couldn't find any similar
cases in existing code. Is there a nice way to solve this?

I'm adding the file that contains the tests, just in case.


#include <kunit/test.h>

#include <asm/e820/api.h>
#include <asm/setup.h>

#define KUNIT_EXPECT_E820_ENTRY_EQ(test, entry, _addr, _size, _type,           \
				   _crypto_capable)                            \
	do {                                                                   \
		KUNIT_EXPECT_EQ((test), (entry).addr, (_addr));                \
		KUNIT_EXPECT_EQ((test), (entry).size, (_size));                \
		KUNIT_EXPECT_EQ((test), (entry).type, (_type));                \
		KUNIT_EXPECT_EQ((test), (entry).crypto_capable,                \
				(_crypto_capable));                            \
	} while (0)

struct e820_table test_table __initdata;

static void __init test_e820_range_add(struct kunit *test)
{
        u32 full;

        full = ARRAY_SIZE(test_table.entries);
        /* Add last entry. */
        test_table.nr_entries = full - 1;
        __e820__range_add(&test_table, 0, 15, 0, 0);
        KUNIT_EXPECT_EQ(test, test_table.nr_entries, full);
        /* Skip new entry when full. */
        __e820__range_add(&test_table, 0, 15, 0, 0);
        KUNIT_EXPECT_EQ(test, test_table.nr_entries, full);
}

static void __init test_e820_range_update(struct kunit *test)
{
	u64 entry_size = 15;
	u64 updated_size = 0;
	/* Initialize table */
	test_table.nr_entries = 0;
	__e820__range_add(&test_table, 0, entry_size, E820_TYPE_RAM,
			  E820_NOT_CRYPTO_CAPABLE);
	__e820__range_add(&test_table, entry_size, entry_size, E820_TYPE_RAM,
			  E820_NOT_CRYPTO_CAPABLE);
	__e820__range_add(&test_table, entry_size * 2, entry_size,
			  E820_TYPE_ACPI, E820_NOT_CRYPTO_CAPABLE);

	updated_size = __e820__range_update(&test_table, 0, entry_size * 2,
					    E820_TYPE_RAM, E820_TYPE_RESERVED);
	/* The first 2 regions were updated */
	KUNIT_EXPECT_EQ(test, updated_size, entry_size * 2);
	KUNIT_EXPECT_E820_ENTRY_EQ(test, test_table.entries[0], 0, entry_size,
				   E820_TYPE_RESERVED, E820_NOT_CRYPTO_CAPABLE);
	KUNIT_EXPECT_E820_ENTRY_EQ(test, test_table.entries[1], entry_size,
				   entry_size, E820_TYPE_RESERVED,
				   E820_NOT_CRYPTO_CAPABLE);
	KUNIT_EXPECT_E820_ENTRY_EQ(test, test_table.entries[2], entry_size * 2,
				   entry_size, E820_TYPE_ACPI,
				   E820_NOT_CRYPTO_CAPABLE);

	updated_size = __e820__range_update(&test_table, 0, entry_size * 3,
					    E820_TYPE_RESERVED, E820_TYPE_RAM);
	/* Only the first 2 regions were updated */
	KUNIT_EXPECT_EQ(test, updated_size, entry_size * 2);
	KUNIT_EXPECT_E820_ENTRY_EQ(test, test_table.entries[0], 0, entry_size,
				   E820_TYPE_RAM, E820_NOT_CRYPTO_CAPABLE);
	KUNIT_EXPECT_E820_ENTRY_EQ(test, test_table.entries[1], entry_size,
				   entry_size, E820_TYPE_RAM,
				   E820_NOT_CRYPTO_CAPABLE);
	KUNIT_EXPECT_E820_ENTRY_EQ(test, test_table.entries[2], entry_size * 2,
				   entry_size, E820_TYPE_ACPI,
				   E820_NOT_CRYPTO_CAPABLE);
}

static void __init test_e820_range_remove(struct kunit *test)
{
	u64 entry_size = 15;
	u64 removed_size = 0;

	struct e820_entry_updater updater = {
		.should_update = remover__should_update,
		.update = remover__update,
		.new = remover__new
	};

	struct e820_remover_data data = {
		.check_type = true,
		.old_type = E820_TYPE_RAM
	};

	/* Initialize table */
	test_table.nr_entries = 0;
	__e820__range_add(&test_table, 0, entry_size, E820_TYPE_RAM,
			  E820_NOT_CRYPTO_CAPABLE);
	__e820__range_add(&test_table, entry_size, entry_size, E820_TYPE_RAM,
			  E820_NOT_CRYPTO_CAPABLE);
	__e820__range_add(&test_table, entry_size * 2, entry_size,
			  E820_TYPE_ACPI, E820_NOT_CRYPTO_CAPABLE);

	/*
	 * Need to use __e820__handle_range_update because
	 * e820__range_remove doesn't ask for the table
	 */
	removed_size = __e820__handle_range_update(&test_table,
						   0, entry_size * 2,
						   &updater, &data);
	/* The first two regions were removed */
	KUNIT_EXPECT_EQ(test, removed_size, entry_size * 2);
	KUNIT_EXPECT_E820_ENTRY_EQ(test, test_table.entries[0], 0, 0, 0, 0);
	KUNIT_EXPECT_E820_ENTRY_EQ(test, test_table.entries[1], 0, 0, 0, 0);

	removed_size = __e820__handle_range_update(&test_table,
						   0, entry_size * 3,
						   &updater, &data);
	/* Nothing was removed */
	KUNIT_EXPECT_EQ(test, removed_size, 0);
	KUNIT_EXPECT_E820_ENTRY_EQ(test, test_table.entries[0], 0, 0, 0, 0);
	KUNIT_EXPECT_E820_ENTRY_EQ(test, test_table.entries[1], 0, 0, 0, 0);
	KUNIT_EXPECT_E820_ENTRY_EQ(test, test_table.entries[2], entry_size * 2,
				   entry_size, E820_TYPE_ACPI,
				   E820_NOT_CRYPTO_CAPABLE);
}

static void __init test_e820_range_crypto_update(struct kunit *test)
{
	u64 entry_size = 15;
	u64 updated_size = 0;
	/* Initialize table */
	test_table.nr_entries = 0;
	__e820__range_add(&test_table, 0, entry_size, E820_TYPE_RAM,
			  E820_CRYPTO_CAPABLE);
	__e820__range_add(&test_table, entry_size, entry_size, E820_TYPE_RAM,
			  E820_NOT_CRYPTO_CAPABLE);
	__e820__range_add(&test_table, entry_size * 2, entry_size,
			  E820_TYPE_RAM, E820_CRYPTO_CAPABLE);

	updated_size = __e820__range_update_crypto(&test_table, 0, entry_size * 3,
						   E820_CRYPTO_CAPABLE);
	/* Only the region in the middle was updated */
	KUNIT_EXPECT_EQ(test, updated_size, entry_size);
	KUNIT_EXPECT_E820_ENTRY_EQ(test, test_table.entries[0], 0, entry_size,
				   E820_TYPE_RAM, E820_CRYPTO_CAPABLE);
	KUNIT_EXPECT_E820_ENTRY_EQ(test, test_table.entries[1], entry_size,
				   entry_size, E820_TYPE_RAM,
				   E820_CRYPTO_CAPABLE);
	KUNIT_EXPECT_E820_ENTRY_EQ(test, test_table.entries[2], entry_size * 2,
				   entry_size, E820_TYPE_RAM,
				   E820_CRYPTO_CAPABLE);
}

static void __init test_e820_handle_range_update_intersection(struct kunit *test)
{
	struct e820_entry_updater updater = {
		.should_update = type_updater__should_update,
		.update = type_updater__update,
		.new = type_updater__new
	};

	struct e820_type_updater_data data = {
		.old_type = E820_TYPE_RAM,
		.new_type = E820_TYPE_RESERVED
	};

	u64 entry_size = 15;
	u64 updated_size = 0;
	/* Initialize table */
	test_table.nr_entries = 0;
	__e820__range_add(&test_table, 0, entry_size, E820_TYPE_RAM,
			  E820_NOT_CRYPTO_CAPABLE);

	updated_size = __e820__handle_range_update(&test_table,
						   0, entry_size - 2,
						   &updater, &data);

	KUNIT_EXPECT_EQ(test, updated_size, entry_size - 2);

	/* There is a new entry */
	KUNIT_EXPECT_EQ(test, test_table.nr_entries, 2);

	/* The original entry now is moved */
	KUNIT_EXPECT_E820_ENTRY_EQ(test, test_table.entries[0], entry_size - 2,
				   2, E820_TYPE_RAM, E820_NOT_CRYPTO_CAPABLE);

	/* The new entry has the correct values */
	KUNIT_EXPECT_E820_ENTRY_EQ(test, test_table.entries[1], 0, 13,
				   E820_TYPE_RESERVED, E820_NOT_CRYPTO_CAPABLE);
}

static void __init test_e820_handle_range_update_inside(struct kunit *test)
{
	struct e820_entry_updater updater = {
		.should_update = type_updater__should_update,
		.update = type_updater__update,
		.new = type_updater__new
	};

	struct e820_type_updater_data data = {
		.old_type = E820_TYPE_RAM,
		.new_type = E820_TYPE_RESERVED
	};

	u64 entry_size = 15;
	u64 updated_size = 0;
	/* Initialize table */
	test_table.nr_entries = 0;
	__e820__range_add(&test_table, 0, entry_size, E820_TYPE_RAM,
			  E820_NOT_CRYPTO_CAPABLE);

	updated_size = __e820__handle_range_update(&test_table,
						   5, entry_size - 10,
						   &updater, &data);

	KUNIT_EXPECT_EQ(test, updated_size, entry_size - 10);

	/* There are two new entries */
	KUNIT_EXPECT_EQ(test, test_table.nr_entries, 3);

	/* The original entry has now shrunk */
	KUNIT_EXPECT_E820_ENTRY_EQ(test, test_table.entries[0], 0, 5,
				   E820_TYPE_RAM, E820_NOT_CRYPTO_CAPABLE);

	/* The new entries have the correct values */
	KUNIT_EXPECT_E820_ENTRY_EQ(test, test_table.entries[1], 5,
				   entry_size - 10, E820_TYPE_RESERVED,
				   E820_NOT_CRYPTO_CAPABLE);
	/* Left over of the original region */
	KUNIT_EXPECT_E820_ENTRY_EQ(test, test_table.entries[2], entry_size - 5,
				   5, E820_TYPE_RAM, E820_NOT_CRYPTO_CAPABLE);
}

static struct kunit_case e820_test_cases[] __initdata = {
        KUNIT_CASE(test_e820_range_add),
        KUNIT_CASE(test_e820_range_update),
        KUNIT_CASE(test_e820_range_remove),
        KUNIT_CASE(test_e820_range_crypto_update),
        KUNIT_CASE(test_e820_handle_range_update_intersection),
        KUNIT_CASE(test_e820_handle_range_update_inside),
        {}
};

static struct kunit_suite e820_test_suite __initdata = {
        .name = "e820",
        .test_cases = e820_test_cases,
};

kunit_test_suite(e820_test_suite);
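
The expectations in the last two tests (a range that cuts off the head of an entry, and a range that sits strictly inside an entry) can be reproduced with a small userspace model of the three overlap cases. This is a sketch only: it follows the overlap checks described in this thread rather than the refactored code from the patch, it leaves out the crypto_capable field, and `range_add`, `range_update`, `demo_inside` and `demo_head` are names made up for the example.

```c
#include <assert.h>
#include <stdint.h>

enum mem_type { TYPE_RAM = 1, TYPE_RESERVED = 2 };

struct entry { uint64_t addr, size; enum mem_type type; };
struct table { unsigned int nr_entries; struct entry entries[8]; };

static void range_add(struct table *t, uint64_t addr, uint64_t size,
		      enum mem_type type)
{
	if (t->nr_entries >= sizeof(t->entries) / sizeof(t->entries[0]))
		return;	/* table full: drop the entry */
	t->entries[t->nr_entries++] = (struct entry){ addr, size, type };
}

/* Retype [start, start + size) from old_type to new_type; returns bytes changed. */
uint64_t range_update(struct table *t, uint64_t start, uint64_t size,
		      enum mem_type old_type, enum mem_type new_type)
{
	uint64_t end = start + size, updated = 0;

	for (unsigned int i = 0; i < t->nr_entries; i++) {
		struct entry *e = &t->entries[i];
		uint64_t e_end = e->addr + e->size;

		if (e->type != old_type)
			continue;

		/* Case 1: the entry lies completely inside the range. */
		if (e->addr >= start && e_end <= end) {
			e->type = new_type;
			updated += e->size;
			continue;
		}

		/* Case 2: the range lies strictly inside the entry: split in three. */
		if (e->addr < start && e_end > end) {
			range_add(t, start, size, new_type);
			range_add(t, end, e_end - end, e->type);
			e->size = start - e->addr;
			updated += size;
			continue;
		}

		/* Case 3: partial overlap at the head or the tail of the entry. */
		uint64_t lo = start > e->addr ? start : e->addr;
		uint64_t hi = end < e_end ? end : e_end;

		if (lo >= hi)
			continue;	/* no overlap at all */

		range_add(t, lo, hi - lo, new_type);
		updated += hi - lo;
		e->size -= hi - lo;
		if (e->addr >= lo)
			e->addr = hi;	/* head was taken: entry starts after it */
	}
	return updated;
}

struct table demo;

/* Range [5, 10) inside a single RAM entry [0, 15). */
uint64_t demo_inside(void)
{
	demo = (struct table){ 0 };
	range_add(&demo, 0, 15, TYPE_RAM);
	return range_update(&demo, 5, 5, TYPE_RAM, TYPE_RESERVED);
}

/* Range [0, 13) covering the head of a single RAM entry [0, 15). */
uint64_t demo_head(void)
{
	demo = (struct table){ 0 };
	range_add(&demo, 0, 15, TYPE_RAM);
	return range_update(&demo, 0, 13, TYPE_RAM, TYPE_RESERVED);
}
```

In the model, new entries are always appended at the end of the table, which is why the original entry stays at index 0 in both cases and ends up describing whatever is left of it.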



Thread overview: 37+ messages
2022-02-03 16:43 [PATCH v6 0/6] x86: Show in sysfs if a memory node is able to do encryption Martin Fernandez
2022-02-03 16:43 ` [PATCH v6 1/6] mm/memblock: Tag memblocks with crypto capabilities Martin Fernandez
2022-02-03 18:07   ` Mike Rapoport
2022-02-03 18:24     ` Martin Fernandez
2022-02-07 21:18   ` Kees Cook
2022-02-08 14:39     ` Martin Fernandez
2022-02-03 16:43 ` [PATCH v6 2/6] mm/mmzone: Tag pg_data_t " Martin Fernandez
2022-02-07 21:19   ` Kees Cook
2022-02-03 16:43 ` [PATCH v6 3/6] x86/e820: Refactor range_update and range_remove Martin Fernandez
2022-02-07 21:45   ` Kees Cook
2022-02-08  8:40     ` Mike Rapoport
2022-02-08 21:01       ` Martin Fernandez
2022-02-15  7:10         ` Mike Rapoport
2022-02-15 14:14           ` Martin Fernandez
2022-02-08 21:09     ` Martin Fernandez
2022-03-04 20:32     ` Martin Fernandez
2022-02-08 21:04   ` Daniel Gutson
2022-02-03 16:43 ` [PATCH v6 4/6] x86/e820: Tag e820_entry with crypto capabilities Martin Fernandez
2022-02-07 21:56   ` Kees Cook
2022-02-08 14:46     ` Martin Fernandez
2022-02-03 16:43 ` [PATCH v6 5/6] x86/efi: Tag e820_entries as crypto capable from EFI memmap Martin Fernandez
2022-02-03 16:43 ` [PATCH v6 6/6] drivers/node: Show in sysfs node's crypto capabilities Martin Fernandez
2022-02-04  3:47   ` Limonciello, Mario
2022-02-04 13:21     ` Martin Fernandez
2022-02-04 15:59       ` Tom Lendacky
2022-02-04 16:23         ` Limonciello, Mario
2022-02-04 16:28           ` Borislav Petkov
2022-02-04 17:12             ` Tom Lendacky
2022-02-04 17:49               ` Limonciello, Mario
2022-02-04 18:00               ` Borislav Petkov
2022-02-04 18:49                 ` Tom Lendacky
2022-02-04 21:49                   ` Borislav Petkov
2022-02-07  3:39             ` Kees Cook
2022-02-07 10:02               ` Borislav Petkov
2022-02-04  4:56   ` Mike Rapoport
2022-02-04 12:27     ` Martin Fernandez
2022-02-04 13:37       ` Mike Rapoport
