LKML Archive on lore.kernel.org
help / color / mirror / Atom feed
* [PATCH v3 0/2] Optimize dma_*_from_contiguous calls
@ 2019-05-24 4:06 Nicolin Chen
2019-05-24 4:06 ` [PATCH v3 1/2] dma-contiguous: Abstract dma_{alloc,free}_contiguous() Nicolin Chen
` (3 more replies)
0 siblings, 4 replies; 14+ messages in thread
From: Nicolin Chen @ 2019-05-24 4:06 UTC (permalink / raw)
To: hch, robin.murphy, m.szyprowski
Cc: vdumpa, linux, catalin.marinas, will.deacon, chris, jcmvbkbc,
joro, dwmw2, tony, akpm, sfr, treding, keescook, iamjoonsoo.kim,
wsa+renesas, linux-arm-kernel, linux-kernel, linux-xtensa, iommu,
dann.frazier
[ Per discussion at v1, we decide to add two new functions and start
replacing callers one by one. For this series, it only touches the
dma-direct part. And instead of merging two PATCHes, I still keep
them separate so that we may easily revert PATCH-2 if anything bad
happens as last time -- PATCH-1 is supposed to be a safe cleanup. ]
This series of patches try to optimize dma_*_from_contiguous calls:
PATCH-1 abstracts two new functions and applies to dma-direct.c file.
PATCH-2 saves single pages and reduce fragmentations from CMA area.
Please check their commit messages for detail changelog.
Nicolin Chen (2):
dma-contiguous: Abstract dma_{alloc,free}_contiguous()
dma-contiguous: Use fallback alloc_pages for single pages
include/linux/dma-contiguous.h | 11 +++++++
kernel/dma/contiguous.c | 57 ++++++++++++++++++++++++++++++++++
kernel/dma/direct.c | 24 +++-----------
3 files changed, 72 insertions(+), 20 deletions(-)
--
2.17.1
^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH v3 1/2] dma-contiguous: Abstract dma_{alloc,free}_contiguous()
2019-05-24 4:06 [PATCH v3 0/2] Optimize dma_*_from_contiguous calls Nicolin Chen
@ 2019-05-24 4:06 ` Nicolin Chen
2019-05-29 18:35 ` Nathan Chancellor
2019-07-25 16:06 ` Ezequiel Garcia
2019-05-24 4:06 ` [PATCH v3 2/2] dma-contiguous: Use fallback alloc_pages for single pages Nicolin Chen
` (2 subsequent siblings)
3 siblings, 2 replies; 14+ messages in thread
From: Nicolin Chen @ 2019-05-24 4:06 UTC (permalink / raw)
To: hch, robin.murphy, m.szyprowski
Cc: vdumpa, linux, catalin.marinas, will.deacon, chris, jcmvbkbc,
joro, dwmw2, tony, akpm, sfr, treding, keescook, iamjoonsoo.kim,
wsa+renesas, linux-arm-kernel, linux-kernel, linux-xtensa, iommu,
dann.frazier
Both dma_alloc_from_contiguous() and dma_release_from_contiguous()
are very simply implemented, but requiring callers to pass certain
parameters like count and align, and taking a boolean parameter to
check __GFP_NOWARN in the allocation flags. So every function call
duplicates similar work:
/* A piece of example */
unsigned long order = get_order(size);
size_t count = size >> PAGE_SHIFT;
page = dma_alloc_from_contiguous(dev, count, order, gfp & __GFP_NOWARN);
[...]
dma_release_from_contiguous(dev, page, size >> PAGE_SHIFT);
Additionally, as CMA can be used only in the context which permits
sleeping, most of callers do a gfpflags_allow_blocking() check and
a corresponding fallback allocation of normal pages upon any false
result:
/* A piece of example */
if (gfpflags_allow_blocking(flag))
page = dma_alloc_from_contiguous();
if (!page)
page = alloc_pages();
[...]
if (!dma_release_from_contiguous(dev, page, count))
__free_pages(page, get_order(size));
So this patch simplifies those function calls by abstracting these
operations into the two new functions: dma_{alloc,free}_contiguous.
As some callers of dma_{alloc,release}_from_contiguous() might be
complicated, this patch just implements these two new functions to
kernel/dma/direct.c only as an initial step.
Suggested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicolin Chen <nicoleotsuka@gmail.com>
---
Changelog
v2->v3:
* Added missing "static inline" in header file to fix build error.
v1->v2:
* Added new functions beside the old ones so we can replace callers
one by one later.
* Applied new functions to dma/direct.c only, because it's the best
example caller to apply and should be safe with the new functions.
include/linux/dma-contiguous.h | 11 ++++++++
kernel/dma/contiguous.c | 48 ++++++++++++++++++++++++++++++++++
kernel/dma/direct.c | 24 +++--------------
3 files changed, 63 insertions(+), 20 deletions(-)
diff --git a/include/linux/dma-contiguous.h b/include/linux/dma-contiguous.h
index f247e8aa5e3d..00a370c1c140 100644
--- a/include/linux/dma-contiguous.h
+++ b/include/linux/dma-contiguous.h
@@ -115,6 +115,8 @@ struct page *dma_alloc_from_contiguous(struct device *dev, size_t count,
unsigned int order, bool no_warn);
bool dma_release_from_contiguous(struct device *dev, struct page *pages,
int count);
+struct page *dma_alloc_contiguous(struct device *dev, size_t size, gfp_t gfp);
+void dma_free_contiguous(struct device *dev, struct page *page, size_t size);
#else
@@ -157,6 +159,15 @@ bool dma_release_from_contiguous(struct device *dev, struct page *pages,
return false;
}
+static inline
+struct page *dma_alloc_contiguous(struct device *dev, size_t size, gfp_t gfp)
+{
+ return NULL;
+}
+
+static inline
+void dma_free_contiguous(struct device *dev, struct page *page, size_t size) { }
+
#endif
#endif
diff --git a/kernel/dma/contiguous.c b/kernel/dma/contiguous.c
index b2a87905846d..21f39a6cb04f 100644
--- a/kernel/dma/contiguous.c
+++ b/kernel/dma/contiguous.c
@@ -214,6 +214,54 @@ bool dma_release_from_contiguous(struct device *dev, struct page *pages,
return cma_release(dev_get_cma_area(dev), pages, count);
}
+/**
+ * dma_alloc_contiguous() - allocate contiguous pages
+ * @dev: Pointer to device for which the allocation is performed.
+ * @size: Requested allocation size.
+ * @gfp: Allocation flags.
+ *
+ * This function allocates contiguous memory buffer for specified device. It
+ * first tries to use device specific contiguous memory area if available or
+ * the default global one, then tries a fallback allocation of normal pages.
+ */
+struct page *dma_alloc_contiguous(struct device *dev, size_t size, gfp_t gfp)
+{
+ int node = dev ? dev_to_node(dev) : NUMA_NO_NODE;
+ size_t count = PAGE_ALIGN(size) >> PAGE_SHIFT;
+ size_t align = get_order(PAGE_ALIGN(size));
+ struct cma *cma = dev_get_cma_area(dev);
+ struct page *page = NULL;
+
+ /* CMA can be used only in the context which permits sleeping */
+ if (cma && gfpflags_allow_blocking(gfp)) {
+ align = min_t(size_t, align, CONFIG_CMA_ALIGNMENT);
+ page = cma_alloc(cma, count, align, gfp & __GFP_NOWARN);
+ }
+
+ /* Fallback allocation of normal pages */
+ if (!page)
+ page = alloc_pages_node(node, gfp, align);
+
+ return page;
+}
+
+/**
+ * dma_free_contiguous() - release allocated pages
+ * @dev: Pointer to device for which the pages were allocated.
+ * @page: Pointer to the allocated pages.
+ * @size: Size of allocated pages.
+ *
+ * This function releases memory allocated by dma_alloc_contiguous(). As the
+ * cma_release returns false when provided pages do not belong to contiguous
+ * area and true otherwise, this function then does a fallback __free_pages()
+ * upon a false-return.
+ */
+void dma_free_contiguous(struct device *dev, struct page *page, size_t size)
+{
+ if (!cma_release(dev_get_cma_area(dev), page, size >> PAGE_SHIFT))
+ __free_pages(page, get_order(size));
+}
+
/*
* Support for reserved memory regions defined in device tree
*/
diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
index 2c2772e9702a..0816c1e8b05a 100644
--- a/kernel/dma/direct.c
+++ b/kernel/dma/direct.c
@@ -96,8 +96,6 @@ static bool dma_coherent_ok(struct device *dev, phys_addr_t phys, size_t size)
struct page *__dma_direct_alloc_pages(struct device *dev, size_t size,
dma_addr_t *dma_handle, gfp_t gfp, unsigned long attrs)
{
- unsigned int count = PAGE_ALIGN(size) >> PAGE_SHIFT;
- int page_order = get_order(size);
struct page *page = NULL;
u64 phys_mask;
@@ -109,20 +107,9 @@ struct page *__dma_direct_alloc_pages(struct device *dev, size_t size,
gfp |= __dma_direct_optimal_gfp_mask(dev, dev->coherent_dma_mask,
&phys_mask);
again:
- /* CMA can be used only in the context which permits sleeping */
- if (gfpflags_allow_blocking(gfp)) {
- page = dma_alloc_from_contiguous(dev, count, page_order,
- gfp & __GFP_NOWARN);
- if (page && !dma_coherent_ok(dev, page_to_phys(page), size)) {
- dma_release_from_contiguous(dev, page, count);
- page = NULL;
- }
- }
- if (!page)
- page = alloc_pages_node(dev_to_node(dev), gfp, page_order);
-
+ page = dma_alloc_contiguous(dev, size, gfp);
if (page && !dma_coherent_ok(dev, page_to_phys(page), size)) {
- __free_pages(page, page_order);
+ dma_free_contiguous(dev, page, size);
page = NULL;
if (IS_ENABLED(CONFIG_ZONE_DMA32) &&
@@ -154,7 +141,7 @@ void *dma_direct_alloc_pages(struct device *dev, size_t size,
if (PageHighMem(page)) {
/*
* Depending on the cma= arguments and per-arch setup
- * dma_alloc_from_contiguous could return highmem pages.
+ * dma_alloc_contiguous could return highmem pages.
* Without remapping there is no way to return them here,
* so log an error and fail.
*/
@@ -176,10 +163,7 @@ void *dma_direct_alloc_pages(struct device *dev, size_t size,
void __dma_direct_free_pages(struct device *dev, size_t size, struct page *page)
{
- unsigned int count = PAGE_ALIGN(size) >> PAGE_SHIFT;
-
- if (!dma_release_from_contiguous(dev, page, count))
- __free_pages(page, get_order(size));
+ dma_free_contiguous(dev, page, size);
}
void dma_direct_free_pages(struct device *dev, size_t size, void *cpu_addr,
--
2.17.1
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH v3 2/2] dma-contiguous: Use fallback alloc_pages for single pages
2019-05-24 4:06 [PATCH v3 0/2] Optimize dma_*_from_contiguous calls Nicolin Chen
2019-05-24 4:06 ` [PATCH v3 1/2] dma-contiguous: Abstract dma_{alloc,free}_contiguous() Nicolin Chen
@ 2019-05-24 4:06 ` Nicolin Chen
2019-05-24 16:16 ` Ira Weiny
2019-05-24 20:03 ` [PATCH v3 0/2] Optimize dma_*_from_contiguous calls dann frazier
2019-05-28 6:04 ` Christoph Hellwig
3 siblings, 1 reply; 14+ messages in thread
From: Nicolin Chen @ 2019-05-24 4:06 UTC (permalink / raw)
To: hch, robin.murphy, m.szyprowski
Cc: vdumpa, linux, catalin.marinas, will.deacon, chris, jcmvbkbc,
joro, dwmw2, tony, akpm, sfr, treding, keescook, iamjoonsoo.kim,
wsa+renesas, linux-arm-kernel, linux-kernel, linux-xtensa, iommu,
dann.frazier
The addresses within a single page are always contiguous, so it's
not so necessary to always allocate one single page from CMA area.
Since the CMA area has a limited predefined size of space, it may
run out of space in heavy use cases, where there might be quite a
lot CMA pages being allocated for single pages.
However, there is also a concern that a device might care where a
page comes from -- it might expect the page from CMA area and act
differently if the page doesn't.
This patch tries to use the fallback alloc_pages path, instead of
one-page size allocations from the global CMA area in case that a
device does not have its own CMA area. This'd save resources from
the CMA global area for more CMA allocations, and also reduce CMA
fragmentations resulted from trivial allocations.
Signed-off-by: Nicolin Chen <nicoleotsuka@gmail.com>
---
kernel/dma/contiguous.c | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)
diff --git a/kernel/dma/contiguous.c b/kernel/dma/contiguous.c
index 21f39a6cb04f..6914b92d5c88 100644
--- a/kernel/dma/contiguous.c
+++ b/kernel/dma/contiguous.c
@@ -223,14 +223,23 @@ bool dma_release_from_contiguous(struct device *dev, struct page *pages,
* This function allocates contiguous memory buffer for specified device. It
* first tries to use device specific contiguous memory area if available or
* the default global one, then tries a fallback allocation of normal pages.
+ *
+ * Note that it byapss one-page size of allocations from the global area as
+ * the addresses within one page are always contiguous, so there is no need
+ * to waste CMA pages for that kind; it also helps reduce fragmentations.
*/
struct page *dma_alloc_contiguous(struct device *dev, size_t size, gfp_t gfp)
{
int node = dev ? dev_to_node(dev) : NUMA_NO_NODE;
size_t count = PAGE_ALIGN(size) >> PAGE_SHIFT;
size_t align = get_order(PAGE_ALIGN(size));
- struct cma *cma = dev_get_cma_area(dev);
struct page *page = NULL;
+ struct cma *cma = NULL;
+
+ if (dev && dev->cma_area)
+ cma = dev->cma_area;
+ else if (count > 1)
+ cma = dma_contiguous_default_area;
/* CMA can be used only in the context which permits sleeping */
if (cma && gfpflags_allow_blocking(gfp)) {
--
2.17.1
^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH v3 2/2] dma-contiguous: Use fallback alloc_pages for single pages
2019-05-24 4:06 ` [PATCH v3 2/2] dma-contiguous: Use fallback alloc_pages for single pages Nicolin Chen
@ 2019-05-24 16:16 ` Ira Weiny
2019-05-27 10:55 ` Nicolin Chen
0 siblings, 1 reply; 14+ messages in thread
From: Ira Weiny @ 2019-05-24 16:16 UTC (permalink / raw)
To: Nicolin Chen
Cc: hch, robin.murphy, m.szyprowski, vdumpa, linux, catalin.marinas,
will.deacon, chris, jcmvbkbc, joro, dwmw2, tony, akpm, sfr,
treding, keescook, iamjoonsoo.kim, wsa+renesas, linux-arm-kernel,
linux-kernel, linux-xtensa, iommu, dann.frazier
On Thu, May 23, 2019 at 09:06:33PM -0700, Nicolin Chen wrote:
> The addresses within a single page are always contiguous, so it's
> not so necessary to always allocate one single page from CMA area.
> Since the CMA area has a limited predefined size of space, it may
> run out of space in heavy use cases, where there might be quite a
> lot CMA pages being allocated for single pages.
>
> However, there is also a concern that a device might care where a
> page comes from -- it might expect the page from CMA area and act
> differently if the page doesn't.
How does a device know, after this call, if a CMA area was used? From the
patches I figured a device should not care.
>
> This patch tries to use the fallback alloc_pages path, instead of
> one-page size allocations from the global CMA area in case that a
> device does not have its own CMA area. This'd save resources from
> the CMA global area for more CMA allocations, and also reduce CMA
> fragmentations resulted from trivial allocations.
>
> Signed-off-by: Nicolin Chen <nicoleotsuka@gmail.com>
> ---
> kernel/dma/contiguous.c | 11 ++++++++++-
> 1 file changed, 10 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/dma/contiguous.c b/kernel/dma/contiguous.c
> index 21f39a6cb04f..6914b92d5c88 100644
> --- a/kernel/dma/contiguous.c
> +++ b/kernel/dma/contiguous.c
> @@ -223,14 +223,23 @@ bool dma_release_from_contiguous(struct device *dev, struct page *pages,
> * This function allocates contiguous memory buffer for specified device. It
> * first tries to use device specific contiguous memory area if available or
> * the default global one, then tries a fallback allocation of normal pages.
> + *
> + * Note that it byapss one-page size of allocations from the global area as
> + * the addresses within one page are always contiguous, so there is no need
> + * to waste CMA pages for that kind; it also helps reduce fragmentations.
> */
> struct page *dma_alloc_contiguous(struct device *dev, size_t size, gfp_t gfp)
> {
> int node = dev ? dev_to_node(dev) : NUMA_NO_NODE;
> size_t count = PAGE_ALIGN(size) >> PAGE_SHIFT;
> size_t align = get_order(PAGE_ALIGN(size));
> - struct cma *cma = dev_get_cma_area(dev);
> struct page *page = NULL;
> + struct cma *cma = NULL;
> +
> + if (dev && dev->cma_area)
> + cma = dev->cma_area;
> + else if (count > 1)
> + cma = dma_contiguous_default_area;
Doesn't dev_get_dma_area() already do this?
Ira
>
> /* CMA can be used only in the context which permits sleeping */
> if (cma && gfpflags_allow_blocking(gfp)) {
> --
> 2.17.1
>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v3 0/2] Optimize dma_*_from_contiguous calls
2019-05-24 4:06 [PATCH v3 0/2] Optimize dma_*_from_contiguous calls Nicolin Chen
2019-05-24 4:06 ` [PATCH v3 1/2] dma-contiguous: Abstract dma_{alloc,free}_contiguous() Nicolin Chen
2019-05-24 4:06 ` [PATCH v3 2/2] dma-contiguous: Use fallback alloc_pages for single pages Nicolin Chen
@ 2019-05-24 20:03 ` dann frazier
2019-05-28 6:04 ` Christoph Hellwig
3 siblings, 0 replies; 14+ messages in thread
From: dann frazier @ 2019-05-24 20:03 UTC (permalink / raw)
To: Nicolin Chen
Cc: Christoph Hellwig, Robin Murphy, Marek Szyprowski, vdumpa, linux,
Catalin Marinas, Will Deacon, chris, jcmvbkbc, joro, dwmw2, tony,
akpm, sfr, treding, keescook, iamjoonsoo.kim, wsa+renesas,
linux-arm-kernel, linux-kernel, linux-xtensa, iommu
On Thu, May 23, 2019 at 10:08 PM Nicolin Chen <nicoleotsuka@gmail.com> wrote:
>
> [ Per discussion at v1, we decide to add two new functions and start
> replacing callers one by one. For this series, it only touches the
> dma-direct part. And instead of merging two PATCHes, I still keep
> them separate so that we may easily revert PATCH-2 if anything bad
> happens as last time -- PATCH-1 is supposed to be a safe cleanup. ]
>
> This series of patches try to optimize dma_*_from_contiguous calls:
> PATCH-1 abstracts two new functions and applies to dma-direct.c file.
> PATCH-2 saves single pages and reduce fragmentations from CMA area.
>
> Please check their commit messages for detail changelog.
>
> Nicolin Chen (2):
> dma-contiguous: Abstract dma_{alloc,free}_contiguous()
> dma-contiguous: Use fallback alloc_pages for single pages
>
> include/linux/dma-contiguous.h | 11 +++++++
> kernel/dma/contiguous.c | 57 ++++++++++++++++++++++++++++++++++
> kernel/dma/direct.c | 24 +++-----------
> 3 files changed, 72 insertions(+), 20 deletions(-)
Thanks Nicolin. Tested on a HiSilicon D06 system.
Tested-by: dann frazier <dann.frazier@canonical.com>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v3 2/2] dma-contiguous: Use fallback alloc_pages for single pages
2019-05-24 16:16 ` Ira Weiny
@ 2019-05-27 10:55 ` Nicolin Chen
0 siblings, 0 replies; 14+ messages in thread
From: Nicolin Chen @ 2019-05-27 10:55 UTC (permalink / raw)
To: Ira Weiny
Cc: hch, robin.murphy, m.szyprowski, vdumpa, linux, catalin.marinas,
will.deacon, chris, jcmvbkbc, joro, dwmw2, tony, akpm, sfr,
treding, keescook, iamjoonsoo.kim, wsa+renesas, linux-arm-kernel,
linux-kernel, linux-xtensa, iommu, dann.frazier
Hi Ira,
On Fri, May 24, 2019 at 09:16:19AM -0700, Ira Weiny wrote:
> On Thu, May 23, 2019 at 09:06:33PM -0700, Nicolin Chen wrote:
> > The addresses within a single page are always contiguous, so it's
> > not so necessary to always allocate one single page from CMA area.
> > Since the CMA area has a limited predefined size of space, it may
> > run out of space in heavy use cases, where there might be quite a
> > lot CMA pages being allocated for single pages.
> >
> > However, there is also a concern that a device might care where a
> > page comes from -- it might expect the page from CMA area and act
> > differently if the page doesn't.
>
> How does a device know, after this call, if a CMA area was used? From the
> patches I figured a device should not care.
A device doesn't know. But that doesn't mean a device won't care
at all. There was a concern from Robin and Christoph, as a corner
case that device might act differently if the memory isn't in its
own CMA region. That's why we let it still use its device specific
CMA area.
> > + if (dev && dev->cma_area)
> > + cma = dev->cma_area;
> > + else if (count > 1)
> > + cma = dma_contiguous_default_area;
>
> Doesn't dev_get_dma_area() already do this?
Partially yes. But unwrapping it makes the program flow clear in
my opinion. Actually I should have mentioned that this patch was
suggested by Christoph also.
Otherwise, it would need an override like:
cma = dev_get_dma_area();
if (count > 1 && cma == dma_contiguous_default_area)
cma = NULL;
Which doesn't look that bad though..
Thanks
Nicolin
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v3 0/2] Optimize dma_*_from_contiguous calls
2019-05-24 4:06 [PATCH v3 0/2] Optimize dma_*_from_contiguous calls Nicolin Chen
` (2 preceding siblings ...)
2019-05-24 20:03 ` [PATCH v3 0/2] Optimize dma_*_from_contiguous calls dann frazier
@ 2019-05-28 6:04 ` Christoph Hellwig
2019-05-29 23:06 ` Nicolin Chen
3 siblings, 1 reply; 14+ messages in thread
From: Christoph Hellwig @ 2019-05-28 6:04 UTC (permalink / raw)
To: Nicolin Chen
Cc: hch, robin.murphy, m.szyprowski, vdumpa, linux, catalin.marinas,
will.deacon, chris, jcmvbkbc, joro, dwmw2, tony, akpm, sfr,
treding, keescook, iamjoonsoo.kim, wsa+renesas, linux-arm-kernel,
linux-kernel, linux-xtensa, iommu, dann.frazier
Thanks,
applied to dma-mapping for-next.
Can you also send a conversion of drivers/iommu/dma-iommu.c to your
new helpers against this tree?
http://git.infradead.org/users/hch/dma-mapping.git/shortlog/refs/heads/for-next
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v3 1/2] dma-contiguous: Abstract dma_{alloc,free}_contiguous()
2019-05-24 4:06 ` [PATCH v3 1/2] dma-contiguous: Abstract dma_{alloc,free}_contiguous() Nicolin Chen
@ 2019-05-29 18:35 ` Nathan Chancellor
2019-05-29 22:48 ` Nicolin Chen
2019-07-25 16:06 ` Ezequiel Garcia
1 sibling, 1 reply; 14+ messages in thread
From: Nathan Chancellor @ 2019-05-29 18:35 UTC (permalink / raw)
To: Nicolin Chen
Cc: hch, robin.murphy, m.szyprowski, vdumpa, linux, catalin.marinas,
will.deacon, chris, jcmvbkbc, joro, dwmw2, tony, akpm, sfr,
treding, keescook, iamjoonsoo.kim, wsa+renesas, linux-arm-kernel,
linux-kernel, linux-xtensa, iommu, dann.frazier,
clang-built-linux
[-- Attachment #1: Type: text/plain, Size: 2360 bytes --]
Hi Nicolin,
On Thu, May 23, 2019 at 09:06:32PM -0700, Nicolin Chen wrote:
> Both dma_alloc_from_contiguous() and dma_release_from_contiguous()
> are very simply implemented, but requiring callers to pass certain
> parameters like count and align, and taking a boolean parameter to
> check __GFP_NOWARN in the allocation flags. So every function call
> duplicates similar work:
> /* A piece of example */
> unsigned long order = get_order(size);
> size_t count = size >> PAGE_SHIFT;
> page = dma_alloc_from_contiguous(dev, count, order, gfp & __GFP_NOWARN);
> [...]
> dma_release_from_contiguous(dev, page, size >> PAGE_SHIFT);
>
> Additionally, as CMA can be used only in the context which permits
> sleeping, most of callers do a gfpflags_allow_blocking() check and
> a corresponding fallback allocation of normal pages upon any false
> result:
> /* A piece of example */
> if (gfpflags_allow_blocking(flag))
> page = dma_alloc_from_contiguous();
> if (!page)
> page = alloc_pages();
> [...]
> if (!dma_release_from_contiguous(dev, page, count))
> __free_pages(page, get_order(size));
>
> So this patch simplifies those function calls by abstracting these
> operations into the two new functions: dma_{alloc,free}_contiguous.
>
> As some callers of dma_{alloc,release}_from_contiguous() might be
> complicated, this patch just implements these two new functions to
> kernel/dma/direct.c only as an initial step.
>
> Suggested-by: Christoph Hellwig <hch@lst.de>
> Signed-off-by: Nicolin Chen <nicoleotsuka@gmail.com>
> ---
This commit is causing boot failures in QEMU on x86_64 defconfig:
https://travis-ci.com/ClangBuiltLinux/continuous-integration/jobs/203825363
Attached is a bisect log and a boot log with GCC (just to show it is not
a compiler thing).
My QEMU command line is:
qemu-system-x86_64 -m 512m \
-drive file=images/x86_64/rootfs.ext4,format=raw,if=ide \
-append 'console=ttyS0 root=/dev/sda' \
-nographic \
-kernel arch/x86_64/boot/bzImage
and the rootfs is available here:
https://github.com/ClangBuiltLinux/continuous-integration/raw/master/images/x86_64/rootfs.ext4
I haven't seen a report on this yet so apologize if there is already a
fix in the works. Let me know if you need anythnig from me.
Cheers,
Nathan
[-- Attachment #2: bisect.log --]
[-- Type: text/plain, Size: 2382 bytes --]
git bisect start
# good: [9fb67d643f6f1892a08ee3a04ea54022d1060bb0] Merge tag 'pinctrl-v5.2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl
git bisect good 9fb67d643f6f1892a08ee3a04ea54022d1060bb0
# bad: [9a15d2e3fd03e38a6ee7d7bc34d28bb7340f05f2] Add linux-next specific files for 20190529
git bisect bad 9a15d2e3fd03e38a6ee7d7bc34d28bb7340f05f2
# bad: [eb756b9cce06b6c30b478a7ead67ddc0aa52b421] Merge remote-tracking branch 'crypto/master'
git bisect bad eb756b9cce06b6c30b478a7ead67ddc0aa52b421
# bad: [ffd5fc17ee0cd5f90258a8d4e0b87af913943f72] Merge remote-tracking branch 'xtensa/xtensa-for-next'
git bisect bad ffd5fc17ee0cd5f90258a8d4e0b87af913943f72
# bad: [df846fa0031f0e11dbfd5dd7c959cb45b3c0b3e2] Merge remote-tracking branch 'actions/for-next'
git bisect bad df846fa0031f0e11dbfd5dd7c959cb45b3c0b3e2
# good: [673e28ea2b579adcec369cd7f7295142a6b7e017] Merge remote-tracking branch 'usb.current/usb-linus'
git bisect good 673e28ea2b579adcec369cd7f7295142a6b7e017
# good: [d4f1f6efe84b20ff5f3a4874c580c2bee6cba68a] Merge remote-tracking branch 'kvms390-fixes/master'
git bisect good d4f1f6efe84b20ff5f3a4874c580c2bee6cba68a
# bad: [d0756fb0025937a5ed866d27f4452c66b066c089] Merge remote-tracking branch 'dma-mapping/for-next'
git bisect bad d0756fb0025937a5ed866d27f4452c66b066c089
# good: [49af21d0007414838a27eb4ade21277350e8ef1d] Merge remote-tracking branch 'drm-misc-fixes/for-linux-next-fixes'
git bisect good 49af21d0007414838a27eb4ade21277350e8ef1d
# good: [8680aa5a58abfe6087a3d8248c02232d3e05dc80] iommu/dma: Don't remap CMA unnecessarily
git bisect good 8680aa5a58abfe6087a3d8248c02232d3e05dc80
# good: [efd9f10b70689fdeacadc95b6e0ea6dc311fa64f] iommu/dma: Refactor iommu_dma_mmap
git bisect good efd9f10b70689fdeacadc95b6e0ea6dc311fa64f
# good: [b5f75a3639ff3b547e4eee7671e4321a429747a6] arm64: switch copyright boilerplace to SPDX in dma-mapping.c
git bisect good b5f75a3639ff3b547e4eee7671e4321a429747a6
# bad: [fdaeec198ada8c48bff03c85fab542e5b241f5bc] dma-contiguous: add dma_{alloc,free}_contiguous() helpers
git bisect bad fdaeec198ada8c48bff03c85fab542e5b241f5bc
# good: [a84cc69eb53715d37242a21ada398b0d8cd316fc] arm64: trim includes in dma-mapping.c
git bisect good a84cc69eb53715d37242a21ada398b0d8cd316fc
# first bad commit: [fdaeec198ada8c48bff03c85fab542e5b241f5bc] dma-contiguous: add dma_{alloc,free}_contiguous() helpers
[-- Attachment #3: boot.log --]
[-- Type: text/plain, Size: 22369 bytes --]
^[c^[[?7l^[[2J^[[0mSeaBIOS (version 1.12.0-20181126_142135-anatol)
iPXE (http://ipxe.org) 00:03.0 C980 PCI2.10 PnP PMM+1FF92110+1FEF2110 C980
Press Ctrl-B to configure iPXE (PCI 00:03.0)...\r
Booting from ROM..^[c^[[?7l^[[2J^[[0m.\r[ 0.000000] Linux version 5.2.0-rc2-next-20190529 (nathan@archlinux-epyc) (gcc version 8.3.0 (GCC)) #1 SMP Wed May 29 11:26:28 MST 2019
[ 0.000000] Command line: root=/dev/sda console=ttyS0
[ 0.000000] x86/fpu: Supporting XSAVE feature 0x001: 'x87 floating point registers'
[ 0.000000] x86/fpu: Supporting XSAVE feature 0x002: 'SSE registers'
[ 0.000000] x86/fpu: Supporting XSAVE feature 0x004: 'AVX registers'
[ 0.000000] x86/fpu: xstate_offset[2]: 576, xstate_sizes[2]: 256
[ 0.000000] x86/fpu: Enabled xstate features 0x7, context size is 832 bytes, using 'standard' format.
[ 0.000000] BIOS-provided physical RAM map:
[ 0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009fbff] usable
[ 0.000000] BIOS-e820: [mem 0x000000000009fc00-0x000000000009ffff] reserved
[ 0.000000] BIOS-e820: [mem 0x00000000000f0000-0x00000000000fffff] reserved
[ 0.000000] BIOS-e820: [mem 0x0000000000100000-0x000000001ffdffff] usable
[ 0.000000] BIOS-e820: [mem 0x000000001ffe0000-0x000000001fffffff] reserved
[ 0.000000] BIOS-e820: [mem 0x00000000feffc000-0x00000000feffffff] reserved
[ 0.000000] BIOS-e820: [mem 0x00000000fffc0000-0x00000000ffffffff] reserved
[ 0.000000] NX (Execute Disable) protection: active
[ 0.000000] SMBIOS 2.8 present.
[ 0.000000] DMI: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-20181126_142135-anatol 04/01/2014
[ 0.000000] tsc: Fast TSC calibration using PIT
[ 0.000000] tsc: Detected 1999.996 MHz processor
[ 0.001281] last_pfn = 0x1ffe0 max_arch_pfn = 0x400000000
[ 0.001327] x86/PAT: Configuration [0-7]: WB WC UC- UC WB WP UC- WT
[ 0.003614] found SMP MP-table at [mem 0x000f5cc0-0x000f5ccf]
[ 0.003654] check: Scanning 1 areas for low memory corruption
[ 0.003674] Using GB pages for direct mapping
[ 0.003756] ACPI: Early table checksum verification disabled
[ 0.003788] ACPI: RSDP 0x00000000000F5AF0 000014 (v00 BOCHS )
[ 0.003793] ACPI: RSDT 0x000000001FFE156F 000030 (v01 BOCHS BXPCRSDT 00000001 BXPC 00000001)
[ 0.003798] ACPI: FACP 0x000000001FFE144B 000074 (v01 BOCHS BXPCFACP 00000001 BXPC 00000001)
[ 0.003802] ACPI: DSDT 0x000000001FFE0040 00140B (v01 BOCHS BXPCDSDT 00000001 BXPC 00000001)
[ 0.003804] ACPI: FACS 0x000000001FFE0000 000040
[ 0.003806] ACPI: APIC 0x000000001FFE14BF 000078 (v01 BOCHS BXPCAPIC 00000001 BXPC 00000001)
[ 0.003808] ACPI: HPET 0x000000001FFE1537 000038 (v01 BOCHS BXPCHPET 00000001 BXPC 00000001)
[ 0.004087] No NUMA configuration found
[ 0.004088] Faking a node at [mem 0x0000000000000000-0x000000001ffdffff]
[ 0.004091] NODE_DATA(0) allocated [mem 0x1ffdc000-0x1ffdffff]
[ 0.004328] Zone ranges:
[ 0.004330] DMA [mem 0x0000000000001000-0x0000000000ffffff]
[ 0.004331] DMA32 [mem 0x0000000001000000-0x000000001ffdffff]
[ 0.004332] Normal empty
[ 0.004332] Movable zone start for each node
[ 0.004333] Early memory node ranges
[ 0.004334] node 0: [mem 0x0000000000001000-0x000000000009efff]
[ 0.004335] node 0: [mem 0x0000000000100000-0x000000001ffdffff]
[ 0.004612] Zeroed struct page in unavailable ranges: 98 pages
[ 0.004613] Initmem setup node 0 [mem 0x0000000000001000-0x000000001ffdffff]
[ 0.006924] ACPI: PM-Timer IO Port: 0x608
[ 0.006930] ACPI: LAPIC_NMI (acpi_id[0xff] dfl dfl lint[0x1])
[ 0.006958] IOAPIC[0]: apic_id 0, version 17, address 0xfec00000, GSI 0-23
[ 0.006960] ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
[ 0.006961] ACPI: INT_SRC_OVR (bus 0 bus_irq 5 global_irq 5 high level)
[ 0.006962] ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
[ 0.006963] ACPI: INT_SRC_OVR (bus 0 bus_irq 10 global_irq 10 high level)
[ 0.006964] ACPI: INT_SRC_OVR (bus 0 bus_irq 11 global_irq 11 high level)
[ 0.006967] Using ACPI (MADT) for SMP configuration information
[ 0.006968] ACPI: HPET id: 0x8086a201 base: 0xfed00000
[ 0.006974] smpboot: Allowing 1 CPUs, 0 hotplug CPUs
[ 0.006982] PM: Registered nosave memory: [mem 0x00000000-0x00000fff]
[ 0.006983] PM: Registered nosave memory: [mem 0x0009f000-0x0009ffff]
[ 0.006983] PM: Registered nosave memory: [mem 0x000a0000-0x000effff]
[ 0.006984] PM: Registered nosave memory: [mem 0x000f0000-0x000fffff]
[ 0.006986] [mem 0x20000000-0xfeffbfff] available for PCI devices
[ 0.006989] clocksource: refined-jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 1910969940391419 ns
[ 0.100715] setup_percpu: NR_CPUS:64 nr_cpumask_bits:64 nr_cpu_ids:1 nr_node_ids:1
[ 0.101067] percpu: Embedded 51 pages/cpu s170968 r8192 d29736 u2097152
[ 0.101083] node[0] zonelist: 0:DMA32 0:DMA
[ 0.101086] Built 1 zonelists, mobility grouping on. Total pages: 128873
[ 0.101087] Policy zone: DMA32
[ 0.101089] Kernel command line: root=/dev/sda console=ttyS0
[ 0.102238] Memory: 486332K/523768K available (14340K kernel code, 1311K rwdata, 3128K rodata, 1308K init, 1276K bss, 37436K reserved, 0K cma-reserved)
[ 0.102591] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=1, Nodes=1
[ 0.102930] rcu: Hierarchical RCU implementation.
[ 0.102931] rcu: RCU event tracing is enabled.
[ 0.102932] rcu: RCU restricting CPUs from NR_CPUS=64 to nr_cpu_ids=1.
[ 0.102933] rcu: RCU calculated value of scheduler-enlistment delay is 100 jiffies.
[ 0.102934] rcu: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=1
[ 0.103054] NR_IRQS: 4352, nr_irqs: 256, preallocated irqs: 16
[ 0.103367] random: get_random_bytes called from start_kernel+0x2c4/0x498 with crng_init=0
[ 0.110098] Console: colour VGA+ 80x25
[ 0.176180] printk: console [ttyS0] enabled
[ 0.176728] ACPI: Core revision 20190509
[ 0.177463] clocksource: hpet: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 19112604467 ns
[ 0.178714] APIC: Switch to symmetric I/O mode setup
[ 0.180399] ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
[ 0.185697] clocksource: tsc-early: mask: 0xffffffffffffffff max_cycles: 0x39a8554e05d, max_idle_ns: 881590540420 ns
[ 0.187046] Calibrating delay loop (skipped), value calculated using timer frequency.. 3999.99 BogoMIPS (lpj=1999996)
[ 0.188046] pid_max: default: 32768 minimum: 301
[ 0.189060] LSM: Security Framework initializing
[ 0.189653] SELinux: Initializing.
[ 0.190054] *** VALIDATE SELinux ***
[ 0.190556] Dentry cache hash table entries: 65536 (order: 7, 524288 bytes)
[ 0.191100] Inode-cache hash table entries: 32768 (order: 6, 262144 bytes)
[ 0.192058] Mount-cache hash table entries: 1024 (order: 1, 8192 bytes)
[ 0.193047] Mountpoint-cache hash table entries: 1024 (order: 1, 8192 bytes)
[ 0.194137] *** VALIDATE proc ***
[ 0.195066] *** VALIDATE cgroup1 ***
[ 0.195530] *** VALIDATE cgroup2 ***
[ 0.196165] Last level iTLB entries: 4KB 512, 2MB 255, 4MB 127
[ 0.197046] Last level dTLB entries: 4KB 512, 2MB 255, 4MB 127, 1GB 0
[ 0.197853] Spectre V2 : Mitigation: Full AMD retpoline
[ 0.198045] Spectre V2 : Spectre v2 / SpectreRSB mitigation: Filling RSB on context switch
[ 0.199046] Spectre V2 : mitigation: Enabling conditional Indirect Branch Prediction Barrier
[ 0.200046] Speculative Store Bypass: Mitigation: Speculative Store Bypass disabled via prctl and seccomp
[ 0.205525] Freeing SMP alternatives memory: 40K
[ 0.206148] smpboot: CPU0: AMD EPYC 7401P 24-Core Processor (family: 0x17, model: 0x1, stepping: 0x2)
[ 0.207115] Performance Events: Fam17h core perfctr, AMD PMU driver.
[ 0.207932] ... version: 0
[ 0.208047] ... bit width: 48
[ 0.208573] ... generic registers: 6
[ 0.209047] ... value mask: 0000ffffffffffff
[ 0.209714] ... max period: 00007fffffffffff
[ 0.210046] ... fixed-purpose events: 0
[ 0.210556] ... event mask: 000000000000003f
[ 0.211077] rcu: Hierarchical SRCU implementation.
[ 0.211716] Decoding supported only on Scalable MCA processors.
[ 0.212065] smp: Bringing up secondary CPUs ...
[ 0.212640] smp: Brought up 1 node, 1 CPU
[ 0.213047] smpboot: Max logical packages: 1
[ 0.213588] smpboot: Total of 1 processors activated (3999.99 BogoMIPS)
[ 0.214176] devtmpfs: initialized
[ 0.214753] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 1911260446275000 ns
[ 0.215050] futex hash table entries: 256 (order: 2, 16384 bytes)
[ 0.215897] PM: RTC time: 18:27:49, date: 2019-05-29
[ 0.216386] NET: Registered protocol family 16
[ 0.217012] audit: initializing netlink subsys (disabled)
[ 0.217253] cpuidle: using governor menu
[ 0.218179] ACPI: bus type PCI registered
[ 0.218756] PCI: Using configuration type 1 for base access
[ 0.219047] PCI: Using configuration type 1 for extended access
[ 0.220053] audit: type=2000 audit(1559154469.038:1): state=initialized audit_enabled=0 res=1
[ 0.222328] HugeTLB registered 2.00 MiB page size, pre-allocated 0 pages
[ 0.223111] cryptomgr_test (20) used greatest stack depth: 15520 bytes left
[ 0.224013] kworker/u2:0 (22) used greatest stack depth: 14664 bytes left
[ 0.224111] kworker/u2:0 (26) used greatest stack depth: 14216 bytes left
[ 0.225711] ACPI: Added _OSI(Module Device)
[ 0.226051] ACPI: Added _OSI(Processor Device)
[ 0.226618] ACPI: Added _OSI(3.0 _SCP Extensions)
[ 0.227047] ACPI: Added _OSI(Processor Aggregator Device)
[ 0.227737] ACPI: Added _OSI(Linux-Dell-Video)
[ 0.228071] ACPI: Added _OSI(Linux-Lenovo-NV-HDMI-Audio)
[ 0.228741] ACPI: Added _OSI(Linux-HPI-Hybrid-Graphics)
[ 0.229545] ACPI: 1 ACPI AML tables successfully acquired and loaded
[ 0.231171] ACPI: Interpreter enabled
[ 0.231666] ACPI: (supports S0 S3 S4 S5)
[ 0.232048] ACPI: Using IOAPIC for interrupt routing
[ 0.232683] PCI: Using host bridge windows from ACPI; if necessary, use "pci=nocrs" and report a bug
[ 0.233119] ACPI: Enabled 2 GPEs in block 00 to 0F
[ 0.235370] ACPI: PCI Root Bridge [PCI0] (domain 0000 [bus 00-ff])
[ 0.236052] acpi PNP0A03:00: _OSC: OS supports [ExtendedConfig ASPM ClockPM Segments MSI HPX-Type3]
[ 0.237078] PCI host bridge to bus 0000:00
[ 0.237602] pci_bus 0000:00: root bus resource [io 0x0000-0x0cf7 window]
[ 0.238048] pci_bus 0000:00: root bus resource [io 0x0d00-0xffff window]
[ 0.238942] pci_bus 0000:00: root bus resource [mem 0x000a0000-0x000bffff window]
[ 0.239048] pci_bus 0000:00: root bus resource [mem 0x20000000-0xfebfffff window]
[ 0.240014] pci_bus 0000:00: root bus resource [mem 0x100000000-0x17fffffff window]
[ 0.240047] pci_bus 0000:00: root bus resource [bus 00-ff]
[ 0.240789] pci 0000:00:00.0: [8086:1237] type 00 class 0x060000
[ 0.241446] pci 0000:00:01.0: [8086:7000] type 00 class 0x060100
[ 0.242459] pci 0000:00:01.1: [8086:7010] type 00 class 0x010180
[ 0.244902] pci 0000:00:01.1: reg 0x20: [io 0xc040-0xc04f]
[ 0.245838] pci 0000:00:01.1: legacy IDE quirk: reg 0x10: [io 0x01f0-0x01f7]
[ 0.246049] pci 0000:00:01.1: legacy IDE quirk: reg 0x14: [io 0x03f6]
[ 0.246897] pci 0000:00:01.1: legacy IDE quirk: reg 0x18: [io 0x0170-0x0177]
[ 0.247047] pci 0000:00:01.1: legacy IDE quirk: reg 0x1c: [io 0x0376]
[ 0.248021] pci 0000:00:01.3: [8086:7113] type 00 class 0x068000
[ 0.248360] pci 0000:00:01.3: quirk: [io 0x0600-0x063f] claimed by PIIX4 ACPI
[ 0.249053] pci 0000:00:01.3: quirk: [io 0x0700-0x070f] claimed by PIIX4 SMB
[ 0.250137] pci 0000:00:02.0: [1234:1111] type 00 class 0x030000
[ 0.252059] pci 0000:00:02.0: reg 0x10: [mem 0xfd000000-0xfdffffff pref]
[ 0.256072] pci 0000:00:02.0: reg 0x18: [mem 0xfebf0000-0xfebf0fff]
[ 0.263058] pci 0000:00:02.0: reg 0x30: [mem 0xfebe0000-0xfebeffff pref]
[ 0.264378] pci 0000:00:03.0: [8086:100e] type 00 class 0x020000
[ 0.266050] pci 0000:00:03.0: reg 0x10: [mem 0xfebc0000-0xfebdffff]
[ 0.268049] pci 0000:00:03.0: reg 0x14: [io 0xc000-0xc03f]
[ 0.274058] pci 0000:00:03.0: reg 0x30: [mem 0xfeb80000-0xfebbffff pref]
[ 0.276310] ACPI: PCI Interrupt Link [LNKA] (IRQs 5 *10 11)
[ 0.277301] ACPI: PCI Interrupt Link [LNKB] (IRQs 5 *10 11)
[ 0.278248] ACPI: PCI Interrupt Link [LNKC] (IRQs 5 10 *11)
[ 0.279247] ACPI: PCI Interrupt Link [LNKD] (IRQs 5 10 *11)
[ 0.280166] ACPI: PCI Interrupt Link [LNKS] (IRQs *9)
[ 0.281410] pci 0000:00:02.0: vgaarb: setting as boot VGA device
[ 0.282044] pci 0000:00:02.0: vgaarb: VGA device added: decodes=io+mem,owns=io+mem,locks=none
[ 0.282056] pci 0000:00:02.0: vgaarb: bridge control possible
[ 0.283054] vgaarb: loaded
[ 0.284164] SCSI subsystem initialized
[ 0.285174] ACPI: bus type USB registered
[ 0.286088] usbcore: registered new interface driver usbfs
[ 0.287074] usbcore: registered new interface driver hub
[ 0.288087] usbcore: registered new device driver usb
[ 0.289088] pps_core: LinuxPPS API ver. 1 registered
[ 0.290054] pps_core: Software ver. 5.3.6 - Copyright 2005-2007 Rodolfo Giometti <giometti@linux.it>
[ 0.291064] PTP clock support registered
[ 0.292127] EDAC MC: Ver: 3.0.0
[ 0.293326] Advanced Linux Sound Architecture Driver Initialized.
[ 0.294072] PCI: Using ACPI for IRQ routing
[ 0.295369] NetLabel: Initializing
[ 0.296052] NetLabel: domain hash size = 128
[ 0.297050] NetLabel: protocols = UNLABELED CIPSOv4 CALIPSO
[ 0.298079] NetLabel: unlabeled traffic allowed by default
[ 0.299200] hpet0: at MMIO 0xfed00000, IRQs 2, 8, 0
[ 0.300053] hpet0: 3 comparators, 64-bit 100.000000 MHz counter
[ 0.306070] clocksource: Switched to clocksource tsc-early
[ 0.379516] VFS: Disk quotas dquot_6.6.0
[ 0.380072] VFS: Dquot-cache hash table entries: 512 (order 0, 4096 bytes)
[ 0.380975] *** VALIDATE hugetlbfs ***
[ 0.381470] pnp: PnP ACPI init
[ 0.382179] pnp: PnP ACPI: found 6 devices
[ 0.388273] clocksource: acpi_pm: mask: 0xffffff max_cycles: 0xffffff, max_idle_ns: 2085701024 ns
[ 0.389427] pci_bus 0000:00: resource 4 [io 0x0000-0x0cf7 window]
[ 0.390251] pci_bus 0000:00: resource 5 [io 0x0d00-0xffff window]
[ 0.391054] pci_bus 0000:00: resource 6 [mem 0x000a0000-0x000bffff window]
[ 0.391914] pci_bus 0000:00: resource 7 [mem 0x20000000-0xfebfffff window]
[ 0.392777] pci_bus 0000:00: resource 8 [mem 0x100000000-0x17fffffff window]
[ 0.393680] NET: Registered protocol family 2
[ 0.394330] tcp_listen_portaddr_hash hash table entries: 256 (order: 0, 4096 bytes)
[ 0.395302] TCP established hash table entries: 4096 (order: 3, 32768 bytes)
[ 0.396246] TCP bind hash table entries: 4096 (order: 4, 65536 bytes)
[ 0.397089] TCP: Hash tables configured (established 4096 bind 4096)
[ 0.397884] UDP hash table entries: 256 (order: 1, 8192 bytes)
[ 0.398624] UDP-Lite hash table entries: 256 (order: 1, 8192 bytes)
[ 0.399421] NET: Registered protocol family 1
[ 0.400031] RPC: Registered named UNIX socket transport module.
[ 0.400773] RPC: Registered udp transport module.
[ 0.401367] RPC: Registered tcp transport module.
[ 0.401958] RPC: Registered tcp NFSv4.1 backchannel transport module.
[ 0.402824] pci 0000:00:01.0: PIIX3: Enabling Passive Release
[ 0.403549] pci 0000:00:00.0: Limiting direct PCI/PCI transfers
[ 0.404288] pci 0000:00:01.0: Activating ISA DMA hang workarounds
[ 0.405080] pci 0000:00:02.0: Video device with shadowed ROM at [mem 0x000c0000-0x000dffff]
[ 0.406129] PCI: CLS 0 bytes, default 64
[ 0.406744] check: Scanning for low memory corruption every 60 seconds
[ 0.407727] Initialise system trusted keyrings
[ 0.408320] workingset: timestamp_bits=56 max_order=17 bucket_order=0
[ 0.410320] NFS: Registering the id_resolver key type
[ 0.410953] Key type id_resolver registered
[ 0.411481] Key type id_legacy registered
[ 0.412107] Key type asymmetric registered
[ 0.412623] Asymmetric key parser 'x509' registered
[ 0.413235] Block layer SCSI generic (bsg) driver version 0.4 loaded (major 251)
[ 0.414152] io scheduler mq-deadline registered
[ 0.414715] io scheduler kyber registered
[ 0.415305] input: Power Button as /devices/LNXSYSTM:00/LNXPWRBN:00/input/input0
[ 0.416234] ACPI: Power Button [PWRF]
[ 0.416819] Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
[ 0.440750] 00:05: ttyS0 at I/O 0x3f8 (irq = 4, base_baud = 115200) is a 16550A
[ 0.441978] Non-volatile memory driver v1.3
[ 0.442575] Linux agpgart interface v0.103
[ 0.444240] loop: module loaded
[ 0.444948] ata_piix 0000:00:01.1: failed to start port 0 (errno=-12)
[ 0.445938] ata_piix: probe of 0000:00:01.1 failed with error -12
[ 0.446748] e100: Intel(R) PRO/100 Network Driver, 3.5.24-k2-NAPI
[ 0.447504] e100: Copyright(c) 1999-2006 Intel Corporation
[ 0.448196] e1000: Intel(R) PRO/1000 Network Driver - version 7.3.21-k8-NAPI
[ 0.449116] e1000: Copyright (c) 1999-2006 Intel Corporation.
[ 0.462026] PCI Interrupt Link [LNKC] enabled at IRQ 11
[ 0.764450] e1000 0000:00:03.0 eth0: (PCI:33MHz:32-bit) 52:54:00:12:34:56
[ 0.765305] e1000 0000:00:03.0 eth0: Intel(R) PRO/1000 Network Connection
[ 0.766192] e1000e: Intel(R) PRO/1000 Network Driver - 3.2.6-k
[ 0.766955] e1000e: Copyright(c) 1999 - 2015 Intel Corporation.
[ 0.767732] sky2: driver version 1.30
[ 0.768275] ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
[ 0.769128] ehci-pci: EHCI PCI platform driver
[ 0.769710] ohci_hcd: USB 1.1 'Open' Host Controller (OHCI) Driver
[ 0.770503] ohci-pci: OHCI PCI platform driver
[ 0.771081] uhci_hcd: USB Universal Host Controller Interface driver
[ 0.771910] usbcore: registered new interface driver usblp
[ 0.772628] usbcore: registered new interface driver usb-storage
[ 0.773424] i8042: PNP: PS/2 Controller [PNP0303:KBD,PNP0f13:MOU] at 0x60,0x64 irq 1,12
[ 0.775080] serio: i8042 KBD port at 0x60,0x64 irq 1
[ 0.775729] serio: i8042 AUX port at 0x60,0x64 irq 12
[ 0.776961] input: AT Translated Set 2 keyboard as /devices/platform/i8042/serio0/input/input1
[ 0.779751] rtc_cmos 00:00: RTC can wake from S4
[ 0.780682] rtc_cmos 00:00: registered as rtc0
[ 0.781298] rtc_cmos 00:00: alarms up to one day, y3k, 114 bytes nvram, hpet irqs
[ 0.782350] device-mapper: ioctl: 4.40.0-ioctl (2019-01-18) initialised: dm-devel@redhat.com
[ 0.783478] hidraw: raw HID events driver (C) Jiri Kosina
[ 0.784247] usbcore: registered new interface driver usbhid
[ 0.784940] usbhid: USB HID core driver
[ 0.785877] Initializing XFRM netlink socket
[ 0.786497] NET: Registered protocol family 10
[ 0.787219] Segment Routing with IPv6
[ 0.787713] sit: IPv6, IPv4 and MPLS over IPv4 tunneling driver
[ 0.788537] NET: Registered protocol family 17
[ 0.789122] Key type dns_resolver registered
[ 0.789708] mce: Using 10 MCE banks
[ 0.790165] sched_clock: Marking stable (714371390, 75773427)->(925568491, -135423674)
[ 0.791221] registered taskstats version 1
[ 0.791733] Loading compiled-in X.509 certificates
[ 0.792509] PM: Magic number: 3:11:495
[ 0.793015] printk: console [netcon0] enabled
[ 0.793565] netconsole: network logging started
[ 0.794166] cfg80211: Loading compiled-in X.509 certificates for regulatory database
[ 0.795403] cfg80211: Loaded X.509 cert 'sforshee: 00b28ddf47aef9cea7'
[ 0.796230] ALSA device list:
[ 0.796679] No soundcards found.
[ 0.797386] platform regulatory.0: Direct firmware load for regulatory.db failed with error -2
[ 0.798530] cfg80211: failed to load regulatory.db
[ 1.380351] input: ImExPS/2 Generic Explorer Mouse as /devices/platform/i8042/serio1/input/input3
[ 1.383202] md: Waiting for all devices to be available before autodetect
[ 1.385280] md: If you don't use raid, use raid=noautodetect
[ 1.387215] md: Autodetecting RAID arrays.
[ 1.388498] md: autorun ...
[ 1.389381] md: ... autorun DONE.
[ 1.390461] VFS: Cannot open root device "sda" or unknown-block(0,0): error -6
[ 1.392649] Please append a correct "root=" boot option; here are the available partitions:
[ 1.395141] Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0)
[ 1.397600] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.2.0-rc2-next-20190529 #1
[ 1.399793] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-20181126_142135-anatol 04/01/2014
[ 1.402816] Call Trace:
[ 1.403577] dump_stack+0x46/0x60
[ 1.404586] panic+0xf6/0x2b7
[ 1.405512] mount_block_root+0x191/0x23b
[ 1.406724] ? do_early_param+0x89/0x89
[ 1.407881] mount_root+0x10a/0x128
[ 1.408952] prepare_namespace+0x130/0x166
[ 1.410195] kernel_init_freeable+0x1df/0x1ea
[ 1.411501] ? rest_init+0x9a/0x9a
[ 1.412533] kernel_init+0x5/0xf6
[ 1.413536] ret_from_fork+0x22/0x40
[ 1.417612] Kernel Offset: 0x800000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[ 1.420749] ---[ end Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0) ]---
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v3 1/2] dma-contiguous: Abstract dma_{alloc,free}_contiguous()
2019-05-29 18:35 ` Nathan Chancellor
@ 2019-05-29 22:48 ` Nicolin Chen
0 siblings, 0 replies; 14+ messages in thread
From: Nicolin Chen @ 2019-05-29 22:48 UTC (permalink / raw)
To: Nathan Chancellor
Cc: hch, robin.murphy, m.szyprowski, vdumpa, linux, catalin.marinas,
will.deacon, chris, jcmvbkbc, joro, dwmw2, tony, akpm, sfr,
treding, keescook, iamjoonsoo.kim, wsa+renesas, linux-arm-kernel,
linux-kernel, linux-xtensa, iommu, dann.frazier,
clang-built-linux
Hi Nathan,
On Wed, May 29, 2019 at 11:35:46AM -0700, Nathan Chancellor wrote:
> This commit is causing boot failures in QEMU on x86_64 defconfig:
>
> https://travis-ci.com/ClangBuiltLinux/continuous-integration/jobs/203825363
>
> Attached is a bisect log and a boot log with GCC (just to show it is not
> a compiler thing).
>
> My QEMU command line is:
>
> qemu-system-x86_64 -m 512m \
> -drive file=images/x86_64/rootfs.ext4,format=raw,if=ide \
> -append 'console=ttyS0 root=/dev/sda' \
> -nographic \
> -kernel arch/x86_64/boot/bzImage
>
> and the rootfs is available here:
>
> https://github.com/ClangBuiltLinux/continuous-integration/raw/master/images/x86_64/rootfs.ext4
Thanks for reporting the bug.
I am able to repro the issue with the given command and rootfs. The
problem is that x86_64 has CONFIG_DMA_CMA=n so the helper function
is blank on x86_64 while dma-direct should be platform independent.
A simple fix is to add alloc_pages_node() for !CONFIG_DMA_CMA. I'll
submit a fix soon -- need to figure out a good way though. It seems
that adding the fallback to the !CONFIG_DMA_CMA version would cause
some recipe errors when building the kernel...
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v3 0/2] Optimize dma_*_from_contiguous calls
2019-05-28 6:04 ` Christoph Hellwig
@ 2019-05-29 23:06 ` Nicolin Chen
0 siblings, 0 replies; 14+ messages in thread
From: Nicolin Chen @ 2019-05-29 23:06 UTC (permalink / raw)
To: Christoph Hellwig
Cc: robin.murphy, m.szyprowski, vdumpa, linux, catalin.marinas,
will.deacon, chris, jcmvbkbc, joro, dwmw2, tony, akpm, sfr,
treding, keescook, iamjoonsoo.kim, wsa+renesas, linux-arm-kernel,
linux-kernel, linux-xtensa, iommu, dann.frazier
On Tue, May 28, 2019 at 08:04:24AM +0200, Christoph Hellwig wrote:
> Thanks,
>
> applied to dma-mapping for-next.
>
> Can you also send a conversion of drivers/iommu/dma-iommu.c to your
> new helpers against this tree?
>
> http://git.infradead.org/users/hch/dma-mapping.git/shortlog/refs/heads/for-next
I can. There is a reported regression with !CONFIG_DMA_CMA now
so I will do that after a fix is merged and the whole thing is
stable.
Thank you
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v3 1/2] dma-contiguous: Abstract dma_{alloc,free}_contiguous()
2019-05-24 4:06 ` [PATCH v3 1/2] dma-contiguous: Abstract dma_{alloc,free}_contiguous() Nicolin Chen
2019-05-29 18:35 ` Nathan Chancellor
@ 2019-07-25 16:06 ` Ezequiel Garcia
2019-07-25 16:50 ` Nicolin Chen
1 sibling, 1 reply; 14+ messages in thread
From: Ezequiel Garcia @ 2019-07-25 16:06 UTC (permalink / raw)
To: Nicolin Chen
Cc: hch, robin.murphy, Marek Szyprowski, vdumpa, Russell King,
Catalin Marinas, Will Deacon, chris, jcmvbkbc, joro,
David Woodhouse, Tony Lindgren, Andrew Morton, sfr, treding,
keescook, iamjoonsoo.kim, wsa+renesas, linux-arm-kernel,
Linux Kernel Mailing List, linux-xtensa, iommu, dann.frazier,
dafna.hirschfeld
I can't find a way to forward-redirect from Gmail, so I'm Ccing Dafna
who found a regression caused by this commit. Dafna, can you give all
the details, including the log and how you are reproducing it?
On Fri, 24 May 2019 at 01:08, Nicolin Chen <nicoleotsuka@gmail.com> wrote:
>
> Both dma_alloc_from_contiguous() and dma_release_from_contiguous()
> are very simply implemented, but requiring callers to pass certain
> parameters like count and align, and taking a boolean parameter to
> check __GFP_NOWARN in the allocation flags. So every function call
> duplicates similar work:
> /* A piece of example */
> unsigned long order = get_order(size);
> size_t count = size >> PAGE_SHIFT;
> page = dma_alloc_from_contiguous(dev, count, order, gfp & __GFP_NOWARN);
> [...]
> dma_release_from_contiguous(dev, page, size >> PAGE_SHIFT);
>
> Additionally, as CMA can be used only in the context which permits
> sleeping, most of callers do a gfpflags_allow_blocking() check and
> a corresponding fallback allocation of normal pages upon any false
> result:
> /* A piece of example */
> if (gfpflags_allow_blocking(flag))
> page = dma_alloc_from_contiguous();
> if (!page)
> page = alloc_pages();
> [...]
> if (!dma_release_from_contiguous(dev, page, count))
> __free_pages(page, get_order(size));
>
> So this patch simplifies those function calls by abstracting these
> operations into the two new functions: dma_{alloc,free}_contiguous.
>
> As some callers of dma_{alloc,release}_from_contiguous() might be
> complicated, this patch just implements these two new functions to
> kernel/dma/direct.c only as an initial step.
>
> Suggested-by: Christoph Hellwig <hch@lst.de>
> Signed-off-by: Nicolin Chen <nicoleotsuka@gmail.com>
> ---
> Changelog
> v2->v3:
> * Added missing "static inline" in header file to fix build error.
> v1->v2:
> * Added new functions beside the old ones so we can replace callers
> one by one later.
> * Applied new functions to dma/direct.c only, because it's the best
> example caller to apply and should be safe with the new functions.
>
> include/linux/dma-contiguous.h | 11 ++++++++
> kernel/dma/contiguous.c | 48 ++++++++++++++++++++++++++++++++++
> kernel/dma/direct.c | 24 +++--------------
> 3 files changed, 63 insertions(+), 20 deletions(-)
>
> diff --git a/include/linux/dma-contiguous.h b/include/linux/dma-contiguous.h
> index f247e8aa5e3d..00a370c1c140 100644
> --- a/include/linux/dma-contiguous.h
> +++ b/include/linux/dma-contiguous.h
> @@ -115,6 +115,8 @@ struct page *dma_alloc_from_contiguous(struct device *dev, size_t count,
> unsigned int order, bool no_warn);
> bool dma_release_from_contiguous(struct device *dev, struct page *pages,
> int count);
> +struct page *dma_alloc_contiguous(struct device *dev, size_t size, gfp_t gfp);
> +void dma_free_contiguous(struct device *dev, struct page *page, size_t size);
>
> #else
>
> @@ -157,6 +159,15 @@ bool dma_release_from_contiguous(struct device *dev, struct page *pages,
> return false;
> }
>
> +static inline
> +struct page *dma_alloc_contiguous(struct device *dev, size_t size, gfp_t gfp)
> +{
> + return NULL;
> +}
> +
> +static inline
> +void dma_free_contiguous(struct device *dev, struct page *page, size_t size) { }
> +
> #endif
>
> #endif
> diff --git a/kernel/dma/contiguous.c b/kernel/dma/contiguous.c
> index b2a87905846d..21f39a6cb04f 100644
> --- a/kernel/dma/contiguous.c
> +++ b/kernel/dma/contiguous.c
> @@ -214,6 +214,54 @@ bool dma_release_from_contiguous(struct device *dev, struct page *pages,
> return cma_release(dev_get_cma_area(dev), pages, count);
> }
>
> +/**
> + * dma_alloc_contiguous() - allocate contiguous pages
> + * @dev: Pointer to device for which the allocation is performed.
> + * @size: Requested allocation size.
> + * @gfp: Allocation flags.
> + *
> + * This function allocates contiguous memory buffer for specified device. It
> + * first tries to use device specific contiguous memory area if available or
> + * the default global one, then tries a fallback allocation of normal pages.
> + */
> +struct page *dma_alloc_contiguous(struct device *dev, size_t size, gfp_t gfp)
> +{
> + int node = dev ? dev_to_node(dev) : NUMA_NO_NODE;
> + size_t count = PAGE_ALIGN(size) >> PAGE_SHIFT;
> + size_t align = get_order(PAGE_ALIGN(size));
> + struct cma *cma = dev_get_cma_area(dev);
> + struct page *page = NULL;
> +
> + /* CMA can be used only in the context which permits sleeping */
> + if (cma && gfpflags_allow_blocking(gfp)) {
> + align = min_t(size_t, align, CONFIG_CMA_ALIGNMENT);
> + page = cma_alloc(cma, count, align, gfp & __GFP_NOWARN);
> + }
> +
> + /* Fallback allocation of normal pages */
> + if (!page)
> + page = alloc_pages_node(node, gfp, align);
> +
> + return page;
> +}
> +
> +/**
> + * dma_free_contiguous() - release allocated pages
> + * @dev: Pointer to device for which the pages were allocated.
> + * @page: Pointer to the allocated pages.
> + * @size: Size of allocated pages.
> + *
> + * This function releases memory allocated by dma_alloc_contiguous(). As the
> + * cma_release returns false when provided pages do not belong to contiguous
> + * area and true otherwise, this function then does a fallback __free_pages()
> + * upon a false-return.
> + */
> +void dma_free_contiguous(struct device *dev, struct page *page, size_t size)
> +{
> + if (!cma_release(dev_get_cma_area(dev), page, size >> PAGE_SHIFT))
> + __free_pages(page, get_order(size));
> +}
> +
> /*
> * Support for reserved memory regions defined in device tree
> */
> diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
> index 2c2772e9702a..0816c1e8b05a 100644
> --- a/kernel/dma/direct.c
> +++ b/kernel/dma/direct.c
> @@ -96,8 +96,6 @@ static bool dma_coherent_ok(struct device *dev, phys_addr_t phys, size_t size)
> struct page *__dma_direct_alloc_pages(struct device *dev, size_t size,
> dma_addr_t *dma_handle, gfp_t gfp, unsigned long attrs)
> {
> - unsigned int count = PAGE_ALIGN(size) >> PAGE_SHIFT;
> - int page_order = get_order(size);
> struct page *page = NULL;
> u64 phys_mask;
>
> @@ -109,20 +107,9 @@ struct page *__dma_direct_alloc_pages(struct device *dev, size_t size,
> gfp |= __dma_direct_optimal_gfp_mask(dev, dev->coherent_dma_mask,
> &phys_mask);
> again:
> - /* CMA can be used only in the context which permits sleeping */
> - if (gfpflags_allow_blocking(gfp)) {
> - page = dma_alloc_from_contiguous(dev, count, page_order,
> - gfp & __GFP_NOWARN);
> - if (page && !dma_coherent_ok(dev, page_to_phys(page), size)) {
> - dma_release_from_contiguous(dev, page, count);
> - page = NULL;
> - }
> - }
> - if (!page)
> - page = alloc_pages_node(dev_to_node(dev), gfp, page_order);
> -
> + page = dma_alloc_contiguous(dev, size, gfp);
> if (page && !dma_coherent_ok(dev, page_to_phys(page), size)) {
> - __free_pages(page, page_order);
> + dma_free_contiguous(dev, page, size);
> page = NULL;
>
> if (IS_ENABLED(CONFIG_ZONE_DMA32) &&
> @@ -154,7 +141,7 @@ void *dma_direct_alloc_pages(struct device *dev, size_t size,
> if (PageHighMem(page)) {
> /*
> * Depending on the cma= arguments and per-arch setup
> - * dma_alloc_from_contiguous could return highmem pages.
> + * dma_alloc_contiguous could return highmem pages.
> * Without remapping there is no way to return them here,
> * so log an error and fail.
> */
> @@ -176,10 +163,7 @@ void *dma_direct_alloc_pages(struct device *dev, size_t size,
>
> void __dma_direct_free_pages(struct device *dev, size_t size, struct page *page)
> {
> - unsigned int count = PAGE_ALIGN(size) >> PAGE_SHIFT;
> -
> - if (!dma_release_from_contiguous(dev, page, count))
> - __free_pages(page, get_order(size));
> + dma_free_contiguous(dev, page, size);
> }
>
> void dma_direct_free_pages(struct device *dev, size_t size, void *cpu_addr,
> --
> 2.17.1
>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v3 1/2] dma-contiguous: Abstract dma_{alloc,free}_contiguous()
2019-07-25 16:06 ` Ezequiel Garcia
@ 2019-07-25 16:50 ` Nicolin Chen
2019-07-25 17:31 ` Dafna Hirschfeld
0 siblings, 1 reply; 14+ messages in thread
From: Nicolin Chen @ 2019-07-25 16:50 UTC (permalink / raw)
To: Ezequiel Garcia
Cc: hch, robin.murphy, Marek Szyprowski, vdumpa, Russell King,
Catalin Marinas, Will Deacon, chris, jcmvbkbc, joro,
David Woodhouse, Tony Lindgren, Andrew Morton, sfr, treding,
keescook, iamjoonsoo.kim, wsa+renesas, linux-arm-kernel,
Linux Kernel Mailing List, linux-xtensa, iommu, dann.frazier,
dafna.hirschfeld
On Thu, Jul 25, 2019 at 01:06:42PM -0300, Ezequiel Garcia wrote:
> I can't find a way to forward-redirect from Gmail, so I'm Ccing Dafna
> who found a regression caused by this commit. Dafna, can you give all
> the details, including the log and how you are reproducing it?
I saw the conversation there. Sorry for not replying yet.
May we discuss there since there are full logs available?
Thanks
Nicolin
>
>
> On Fri, 24 May 2019 at 01:08, Nicolin Chen <nicoleotsuka@gmail.com> wrote:
> >
> > Both dma_alloc_from_contiguous() and dma_release_from_contiguous()
> > are very simply implemented, but requiring callers to pass certain
> > parameters like count and align, and taking a boolean parameter to
> > check __GFP_NOWARN in the allocation flags. So every function call
> > duplicates similar work:
> > /* A piece of example */
> > unsigned long order = get_order(size);
> > size_t count = size >> PAGE_SHIFT;
> > page = dma_alloc_from_contiguous(dev, count, order, gfp & __GFP_NOWARN);
> > [...]
> > dma_release_from_contiguous(dev, page, size >> PAGE_SHIFT);
> >
> > Additionally, as CMA can be used only in the context which permits
> > sleeping, most of callers do a gfpflags_allow_blocking() check and
> > a corresponding fallback allocation of normal pages upon any false
> > result:
> > /* A piece of example */
> > if (gfpflags_allow_blocking(flag))
> > page = dma_alloc_from_contiguous();
> > if (!page)
> > page = alloc_pages();
> > [...]
> > if (!dma_release_from_contiguous(dev, page, count))
> > __free_pages(page, get_order(size));
> >
> > So this patch simplifies those function calls by abstracting these
> > operations into the two new functions: dma_{alloc,free}_contiguous.
> >
> > As some callers of dma_{alloc,release}_from_contiguous() might be
> > complicated, this patch just implements these two new functions to
> > kernel/dma/direct.c only as an initial step.
> >
> > Suggested-by: Christoph Hellwig <hch@lst.de>
> > Signed-off-by: Nicolin Chen <nicoleotsuka@gmail.com>
> > ---
> > Changelog
> > v2->v3:
> > * Added missing "static inline" in header file to fix build error.
> > v1->v2:
> > * Added new functions beside the old ones so we can replace callers
> > one by one later.
> > * Applied new functions to dma/direct.c only, because it's the best
> > example caller to apply and should be safe with the new functions.
> >
> > include/linux/dma-contiguous.h | 11 ++++++++
> > kernel/dma/contiguous.c | 48 ++++++++++++++++++++++++++++++++++
> > kernel/dma/direct.c | 24 +++--------------
> > 3 files changed, 63 insertions(+), 20 deletions(-)
> >
> > diff --git a/include/linux/dma-contiguous.h b/include/linux/dma-contiguous.h
> > index f247e8aa5e3d..00a370c1c140 100644
> > --- a/include/linux/dma-contiguous.h
> > +++ b/include/linux/dma-contiguous.h
> > @@ -115,6 +115,8 @@ struct page *dma_alloc_from_contiguous(struct device *dev, size_t count,
> > unsigned int order, bool no_warn);
> > bool dma_release_from_contiguous(struct device *dev, struct page *pages,
> > int count);
> > +struct page *dma_alloc_contiguous(struct device *dev, size_t size, gfp_t gfp);
> > +void dma_free_contiguous(struct device *dev, struct page *page, size_t size);
> >
> > #else
> >
> > @@ -157,6 +159,15 @@ bool dma_release_from_contiguous(struct device *dev, struct page *pages,
> > return false;
> > }
> >
> > +static inline
> > +struct page *dma_alloc_contiguous(struct device *dev, size_t size, gfp_t gfp)
> > +{
> > + return NULL;
> > +}
> > +
> > +static inline
> > +void dma_free_contiguous(struct device *dev, struct page *page, size_t size) { }
> > +
> > #endif
> >
> > #endif
> > diff --git a/kernel/dma/contiguous.c b/kernel/dma/contiguous.c
> > index b2a87905846d..21f39a6cb04f 100644
> > --- a/kernel/dma/contiguous.c
> > +++ b/kernel/dma/contiguous.c
> > @@ -214,6 +214,54 @@ bool dma_release_from_contiguous(struct device *dev, struct page *pages,
> > return cma_release(dev_get_cma_area(dev), pages, count);
> > }
> >
> > +/**
> > + * dma_alloc_contiguous() - allocate contiguous pages
> > + * @dev: Pointer to device for which the allocation is performed.
> > + * @size: Requested allocation size.
> > + * @gfp: Allocation flags.
> > + *
> > + * This function allocates contiguous memory buffer for specified device. It
> > + * first tries to use device specific contiguous memory area if available or
> > + * the default global one, then tries a fallback allocation of normal pages.
> > + */
> > +struct page *dma_alloc_contiguous(struct device *dev, size_t size, gfp_t gfp)
> > +{
> > + int node = dev ? dev_to_node(dev) : NUMA_NO_NODE;
> > + size_t count = PAGE_ALIGN(size) >> PAGE_SHIFT;
> > + size_t align = get_order(PAGE_ALIGN(size));
> > + struct cma *cma = dev_get_cma_area(dev);
> > + struct page *page = NULL;
> > +
> > + /* CMA can be used only in the context which permits sleeping */
> > + if (cma && gfpflags_allow_blocking(gfp)) {
> > + align = min_t(size_t, align, CONFIG_CMA_ALIGNMENT);
> > + page = cma_alloc(cma, count, align, gfp & __GFP_NOWARN);
> > + }
> > +
> > + /* Fallback allocation of normal pages */
> > + if (!page)
> > + page = alloc_pages_node(node, gfp, align);
> > +
> > + return page;
> > +}
> > +
> > +/**
> > + * dma_free_contiguous() - release allocated pages
> > + * @dev: Pointer to device for which the pages were allocated.
> > + * @page: Pointer to the allocated pages.
> > + * @size: Size of allocated pages.
> > + *
> > + * This function releases memory allocated by dma_alloc_contiguous(). As the
> > + * cma_release returns false when provided pages do not belong to contiguous
> > + * area and true otherwise, this function then does a fallback __free_pages()
> > + * upon a false-return.
> > + */
> > +void dma_free_contiguous(struct device *dev, struct page *page, size_t size)
> > +{
> > + if (!cma_release(dev_get_cma_area(dev), page, size >> PAGE_SHIFT))
> > + __free_pages(page, get_order(size));
> > +}
> > +
> > /*
> > * Support for reserved memory regions defined in device tree
> > */
> > diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
> > index 2c2772e9702a..0816c1e8b05a 100644
> > --- a/kernel/dma/direct.c
> > +++ b/kernel/dma/direct.c
> > @@ -96,8 +96,6 @@ static bool dma_coherent_ok(struct device *dev, phys_addr_t phys, size_t size)
> > struct page *__dma_direct_alloc_pages(struct device *dev, size_t size,
> > dma_addr_t *dma_handle, gfp_t gfp, unsigned long attrs)
> > {
> > - unsigned int count = PAGE_ALIGN(size) >> PAGE_SHIFT;
> > - int page_order = get_order(size);
> > struct page *page = NULL;
> > u64 phys_mask;
> >
> > @@ -109,20 +107,9 @@ struct page *__dma_direct_alloc_pages(struct device *dev, size_t size,
> > gfp |= __dma_direct_optimal_gfp_mask(dev, dev->coherent_dma_mask,
> > &phys_mask);
> > again:
> > - /* CMA can be used only in the context which permits sleeping */
> > - if (gfpflags_allow_blocking(gfp)) {
> > - page = dma_alloc_from_contiguous(dev, count, page_order,
> > - gfp & __GFP_NOWARN);
> > - if (page && !dma_coherent_ok(dev, page_to_phys(page), size)) {
> > - dma_release_from_contiguous(dev, page, count);
> > - page = NULL;
> > - }
> > - }
> > - if (!page)
> > - page = alloc_pages_node(dev_to_node(dev), gfp, page_order);
> > -
> > + page = dma_alloc_contiguous(dev, size, gfp);
> > if (page && !dma_coherent_ok(dev, page_to_phys(page), size)) {
> > - __free_pages(page, page_order);
> > + dma_free_contiguous(dev, page, size);
> > page = NULL;
> >
> > if (IS_ENABLED(CONFIG_ZONE_DMA32) &&
> > @@ -154,7 +141,7 @@ void *dma_direct_alloc_pages(struct device *dev, size_t size,
> > if (PageHighMem(page)) {
> > /*
> > * Depending on the cma= arguments and per-arch setup
> > - * dma_alloc_from_contiguous could return highmem pages.
> > + * dma_alloc_contiguous could return highmem pages.
> > * Without remapping there is no way to return them here,
> > * so log an error and fail.
> > */
> > @@ -176,10 +163,7 @@ void *dma_direct_alloc_pages(struct device *dev, size_t size,
> >
> > void __dma_direct_free_pages(struct device *dev, size_t size, struct page *page)
> > {
> > - unsigned int count = PAGE_ALIGN(size) >> PAGE_SHIFT;
> > -
> > - if (!dma_release_from_contiguous(dev, page, count))
> > - __free_pages(page, get_order(size));
> > + dma_free_contiguous(dev, page, size);
> > }
> >
> > void dma_direct_free_pages(struct device *dev, size_t size, void *cpu_addr,
> > --
> > 2.17.1
> >
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v3 1/2] dma-contiguous: Abstract dma_{alloc,free}_contiguous()
2019-07-25 16:50 ` Nicolin Chen
@ 2019-07-25 17:31 ` Dafna Hirschfeld
2019-07-25 23:42 ` Nicolin Chen
0 siblings, 1 reply; 14+ messages in thread
From: Dafna Hirschfeld @ 2019-07-25 17:31 UTC (permalink / raw)
To: Nicolin Chen, Ezequiel Garcia
Cc: hch, robin.murphy, Marek Szyprowski, vdumpa, Russell King,
Catalin Marinas, Will Deacon, chris, jcmvbkbc, joro,
David Woodhouse, Tony Lindgren, Andrew Morton, sfr, treding,
keescook, iamjoonsoo.kim, wsa+renesas, linux-arm-kernel,
Linux Kernel Mailing List, linux-xtensa, iommu, dann.frazier,
linux-media, hans.verkuil
On Thu, 2019-07-25 at 09:50 -0700, Nicolin Chen wrote:
> On Thu, Jul 25, 2019 at 01:06:42PM -0300, Ezequiel Garcia wrote:
> > I can't find a way to forward-redirect from Gmail, so I'm Ccing Dafna
> > who found a regression caused by this commit. Dafna, can you give all
> > the details, including the log and how you are reproducing it?
>
> I saw the conversation there. Sorry for not replying yet.
> May we discuss there since there are full logs available?
>
> Thanks
> Nicolin
>
Hi,
I compiled vivid as built-in into the kernel (not as a separate module) for nitrogen8m device (imx8) and
set it to use contig dma for mem_ops by adding the kernel param
vivid.allocators=1,1,...
I use this devicetree patch for the dtb file: https://lkml.org/lkml/2019/7/24/789. Although it should
be the same on any Aarch64 platform.
Then, on the board I run the command:
v4l2-ctl -d3 -v width=2592,height=1944,pixelformat=UYVY,bytesperline=5184 --stream-mmap --stream-to video.UYVY
In every run there is a different crash. Here is one of them: https://pastebin.com/xXgbXMAN
Dafna
> >
> >
> > > > On Fri, 24 May 2019 at 01:08, Nicolin Chen <nicoleotsuka@gmail.com> wrote:
> > >
> > > Both dma_alloc_from_contiguous() and dma_release_from_contiguous()
> > > are very simply implemented, but requiring callers to pass certain
> > > parameters like count and align, and taking a boolean parameter to
> > > check __GFP_NOWARN in the allocation flags. So every function call
> > > duplicates similar work:
> > > /* A piece of example */
> > > unsigned long order = get_order(size);
> > > size_t count = size >> PAGE_SHIFT;
> > > page = dma_alloc_from_contiguous(dev, count, order, gfp & __GFP_NOWARN);
> > > [...]
> > > dma_release_from_contiguous(dev, page, size >> PAGE_SHIFT);
> > >
> > > Additionally, as CMA can be used only in the context which permits
> > > sleeping, most of callers do a gfpflags_allow_blocking() check and
> > > a corresponding fallback allocation of normal pages upon any false
> > > result:
> > > /* A piece of example */
> > > if (gfpflags_allow_blocking(flag))
> > > page = dma_alloc_from_contiguous();
> > > if (!page)
> > > page = alloc_pages();
> > > [...]
> > > if (!dma_release_from_contiguous(dev, page, count))
> > > __free_pages(page, get_order(size));
> > >
> > > So this patch simplifies those function calls by abstracting these
> > > operations into the two new functions: dma_{alloc,free}_contiguous.
> > >
> > > As some callers of dma_{alloc,release}_from_contiguous() might be
> > > complicated, this patch just implements these two new functions to
> > > kernel/dma/direct.c only as an initial step.
> > >
> > > > > > Suggested-by: Christoph Hellwig <hch@lst.de>
> > > > > > Signed-off-by: Nicolin Chen <nicoleotsuka@gmail.com>
> > > ---
> > > Changelog
> > > v2->v3:
> > > * Added missing "static inline" in header file to fix build error.
> > > v1->v2:
> > > * Added new functions beside the old ones so we can replace callers
> > > one by one later.
> > > * Applied new functions to dma/direct.c only, because it's the best
> > > example caller to apply and should be safe with the new functions.
> > >
> > > include/linux/dma-contiguous.h | 11 ++++++++
> > > kernel/dma/contiguous.c | 48 ++++++++++++++++++++++++++++++++++
> > > kernel/dma/direct.c | 24 +++--------------
> > > 3 files changed, 63 insertions(+), 20 deletions(-)
> > >
> > > diff --git a/include/linux/dma-contiguous.h b/include/linux/dma-contiguous.h
> > > index f247e8aa5e3d..00a370c1c140 100644
> > > --- a/include/linux/dma-contiguous.h
> > > +++ b/include/linux/dma-contiguous.h
> > > @@ -115,6 +115,8 @@ struct page *dma_alloc_from_contiguous(struct device *dev, size_t count,
> > > unsigned int order, bool no_warn);
> > > bool dma_release_from_contiguous(struct device *dev, struct page *pages,
> > > int count);
> > > +struct page *dma_alloc_contiguous(struct device *dev, size_t size, gfp_t gfp);
> > > +void dma_free_contiguous(struct device *dev, struct page *page, size_t size);
> > >
> > > #else
> > >
> > > @@ -157,6 +159,15 @@ bool dma_release_from_contiguous(struct device *dev, struct page *pages,
> > > return false;
> > > }
> > >
> > > +static inline
> > > +struct page *dma_alloc_contiguous(struct device *dev, size_t size, gfp_t gfp)
> > > +{
> > > + return NULL;
> > > +}
> > > +
> > > +static inline
> > > +void dma_free_contiguous(struct device *dev, struct page *page, size_t size) { }
> > > +
> > > #endif
> > >
> > > #endif
> > > diff --git a/kernel/dma/contiguous.c b/kernel/dma/contiguous.c
> > > index b2a87905846d..21f39a6cb04f 100644
> > > --- a/kernel/dma/contiguous.c
> > > +++ b/kernel/dma/contiguous.c
> > > @@ -214,6 +214,54 @@ bool dma_release_from_contiguous(struct device *dev, struct page *pages,
> > > return cma_release(dev_get_cma_area(dev), pages, count);
> > > }
> > >
> > > +/**
> > > + * dma_alloc_contiguous() - allocate contiguous pages
> > > + * @dev: Pointer to device for which the allocation is performed.
> > > + * @size: Requested allocation size.
> > > + * @gfp: Allocation flags.
> > > + *
> > > + * This function allocates contiguous memory buffer for specified device. It
> > > + * first tries to use device specific contiguous memory area if available or
> > > + * the default global one, then tries a fallback allocation of normal pages.
> > > + */
> > > +struct page *dma_alloc_contiguous(struct device *dev, size_t size, gfp_t gfp)
> > > +{
> > > + int node = dev ? dev_to_node(dev) : NUMA_NO_NODE;
> > > + size_t count = PAGE_ALIGN(size) >> PAGE_SHIFT;
> > > + size_t align = get_order(PAGE_ALIGN(size));
> > > + struct cma *cma = dev_get_cma_area(dev);
> > > + struct page *page = NULL;
> > > +
> > > + /* CMA can be used only in the context which permits sleeping */
> > > + if (cma && gfpflags_allow_blocking(gfp)) {
> > > + align = min_t(size_t, align, CONFIG_CMA_ALIGNMENT);
> > > + page = cma_alloc(cma, count, align, gfp & __GFP_NOWARN);
> > > + }
> > > +
> > > + /* Fallback allocation of normal pages */
> > > + if (!page)
> > > + page = alloc_pages_node(node, gfp, align);
> > > +
> > > + return page;
> > > +}
> > > +
> > > +/**
> > > + * dma_free_contiguous() - release allocated pages
> > > + * @dev: Pointer to device for which the pages were allocated.
> > > + * @page: Pointer to the allocated pages.
> > > + * @size: Size of allocated pages.
> > > + *
> > > + * This function releases memory allocated by dma_alloc_contiguous(). As the
> > > + * cma_release returns false when provided pages do not belong to contiguous
> > > + * area and true otherwise, this function then does a fallback __free_pages()
> > > + * upon a false-return.
> > > + */
> > > +void dma_free_contiguous(struct device *dev, struct page *page, size_t size)
> > > +{
> > > + if (!cma_release(dev_get_cma_area(dev), page, size >> PAGE_SHIFT))
> > > + __free_pages(page, get_order(size));
> > > +}
> > > +
> > > /*
> > > * Support for reserved memory regions defined in device tree
> > > */
> > > diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
> > > index 2c2772e9702a..0816c1e8b05a 100644
> > > --- a/kernel/dma/direct.c
> > > +++ b/kernel/dma/direct.c
> > > @@ -96,8 +96,6 @@ static bool dma_coherent_ok(struct device *dev, phys_addr_t phys, size_t size)
> > > struct page *__dma_direct_alloc_pages(struct device *dev, size_t size,
> > > dma_addr_t *dma_handle, gfp_t gfp, unsigned long attrs)
> > > {
> > > - unsigned int count = PAGE_ALIGN(size) >> PAGE_SHIFT;
> > > - int page_order = get_order(size);
> > > struct page *page = NULL;
> > > u64 phys_mask;
> > >
> > > @@ -109,20 +107,9 @@ struct page *__dma_direct_alloc_pages(struct device *dev, size_t size,
> > > gfp |= __dma_direct_optimal_gfp_mask(dev, dev->coherent_dma_mask,
> > > &phys_mask);
> > > again:
> > > - /* CMA can be used only in the context which permits sleeping */
> > > - if (gfpflags_allow_blocking(gfp)) {
> > > - page = dma_alloc_from_contiguous(dev, count, page_order,
> > > - gfp & __GFP_NOWARN);
> > > - if (page && !dma_coherent_ok(dev, page_to_phys(page), size)) {
> > > - dma_release_from_contiguous(dev, page, count);
> > > - page = NULL;
> > > - }
> > > - }
> > > - if (!page)
> > > - page = alloc_pages_node(dev_to_node(dev), gfp, page_order);
> > > -
> > > + page = dma_alloc_contiguous(dev, size, gfp);
> > > if (page && !dma_coherent_ok(dev, page_to_phys(page), size)) {
> > > - __free_pages(page, page_order);
> > > + dma_free_contiguous(dev, page, size);
> > > page = NULL;
> > >
> > > if (IS_ENABLED(CONFIG_ZONE_DMA32) &&
> > > @@ -154,7 +141,7 @@ void *dma_direct_alloc_pages(struct device *dev, size_t size,
> > > if (PageHighMem(page)) {
> > > /*
> > > * Depending on the cma= arguments and per-arch setup
> > > - * dma_alloc_from_contiguous could return highmem pages.
> > > + * dma_alloc_contiguous could return highmem pages.
> > > * Without remapping there is no way to return them here,
> > > * so log an error and fail.
> > > */
> > > @@ -176,10 +163,7 @@ void *dma_direct_alloc_pages(struct device *dev, size_t size,
> > >
> > > void __dma_direct_free_pages(struct device *dev, size_t size, struct page *page)
> > > {
> > > - unsigned int count = PAGE_ALIGN(size) >> PAGE_SHIFT;
> > > -
> > > - if (!dma_release_from_contiguous(dev, page, count))
> > > - __free_pages(page, get_order(size));
> > > + dma_free_contiguous(dev, page, size);
> > > }
> > >
> > > void dma_direct_free_pages(struct device *dev, size_t size, void *cpu_addr,
> > > --
> > > 2.17.1
> > >
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v3 1/2] dma-contiguous: Abstract dma_{alloc,free}_contiguous()
2019-07-25 17:31 ` Dafna Hirschfeld
@ 2019-07-25 23:42 ` Nicolin Chen
0 siblings, 0 replies; 14+ messages in thread
From: Nicolin Chen @ 2019-07-25 23:42 UTC (permalink / raw)
To: Dafna Hirschfeld
Cc: Ezequiel Garcia, hch, robin.murphy, Marek Szyprowski, vdumpa,
Russell King, Catalin Marinas, Will Deacon, chris, jcmvbkbc,
joro, David Woodhouse, Tony Lindgren, Andrew Morton, sfr,
treding, keescook, iamjoonsoo.kim, wsa+renesas, linux-arm-kernel,
Linux Kernel Mailing List, linux-xtensa, iommu, dann.frazier,
linux-media, hans.verkuil
On Thu, Jul 25, 2019 at 07:31:05PM +0200, Dafna Hirschfeld wrote:
> On Thu, 2019-07-25 at 09:50 -0700, Nicolin Chen wrote:
> > On Thu, Jul 25, 2019 at 01:06:42PM -0300, Ezequiel Garcia wrote:
> > > I can't find a way to forward-redirect from Gmail, so I'm Ccing Dafna
> > > who found a regression caused by this commit. Dafna, can you give all
> > > the details, including the log and how you are reproducing it?
> >
> > I saw the conversation there. Sorry for not replying yet.
> > May we discuss there since there are full logs available?
> >
> > Thanks
> > Nicolin
> >
>
> Hi,
> I compiled vivid as built-in into the kernel (not as a separate module) for nitrogen8m device (imx8) and
> set it to use contig dma for mem_ops by adding the kernel param
> vivid.allocators=1,1,...
>
> I use this devicetree patch for the dtb file: https://lkml.org/lkml/2019/7/24/789. Although it should
> be the same on any Aarch64 platform.
>
> Then, on the board I run the command:
>
> v4l2-ctl -d3 -v width=2592,height=1944,pixelformat=UYVY,bytesperline=5184 --stream-mmap --stream-to video.UYVY
>
> In every run there is a different crash. Here is one of them: https://pastebin.com/xXgbXMAN
This probably should be a fix: https://lkml.org/lkml/2019/7/25/1432
I also sent it to you. Would it be possible for you to test it?
Thanks
Nicolin
>
> Dafna
> > >
> > >
> > > > > On Fri, 24 May 2019 at 01:08, Nicolin Chen <nicoleotsuka@gmail.com> wrote:
> > > >
> > > > Both dma_alloc_from_contiguous() and dma_release_from_contiguous()
> > > > are very simply implemented, but requiring callers to pass certain
> > > > parameters like count and align, and taking a boolean parameter to
> > > > check __GFP_NOWARN in the allocation flags. So every function call
> > > > duplicates similar work:
> > > > /* A piece of example */
> > > > unsigned long order = get_order(size);
> > > > size_t count = size >> PAGE_SHIFT;
> > > > page = dma_alloc_from_contiguous(dev, count, order, gfp & __GFP_NOWARN);
> > > > [...]
> > > > dma_release_from_contiguous(dev, page, size >> PAGE_SHIFT);
> > > >
> > > > Additionally, as CMA can be used only in the context which permits
> > > > sleeping, most of callers do a gfpflags_allow_blocking() check and
> > > > a corresponding fallback allocation of normal pages upon any false
> > > > result:
> > > > /* A piece of example */
> > > > if (gfpflags_allow_blocking(flag))
> > > > page = dma_alloc_from_contiguous();
> > > > if (!page)
> > > > page = alloc_pages();
> > > > [...]
> > > > if (!dma_release_from_contiguous(dev, page, count))
> > > > __free_pages(page, get_order(size));
> > > >
> > > > So this patch simplifies those function calls by abstracting these
> > > > operations into the two new functions: dma_{alloc,free}_contiguous.
> > > >
> > > > As some callers of dma_{alloc,release}_from_contiguous() might be
> > > > complicated, this patch just implements these two new functions to
> > > > kernel/dma/direct.c only as an initial step.
> > > >
> > > > > > > Suggested-by: Christoph Hellwig <hch@lst.de>
> > > > > > > Signed-off-by: Nicolin Chen <nicoleotsuka@gmail.com>
> > > > ---
> > > > Changelog
> > > > v2->v3:
> > > > * Added missing "static inline" in header file to fix build error.
> > > > v1->v2:
> > > > * Added new functions beside the old ones so we can replace callers
> > > > one by one later.
> > > > * Applied new functions to dma/direct.c only, because it's the best
> > > > example caller to apply and should be safe with the new functions.
> > > >
> > > > include/linux/dma-contiguous.h | 11 ++++++++
> > > > kernel/dma/contiguous.c | 48 ++++++++++++++++++++++++++++++++++
> > > > kernel/dma/direct.c | 24 +++--------------
> > > > 3 files changed, 63 insertions(+), 20 deletions(-)
> > > >
> > > > diff --git a/include/linux/dma-contiguous.h b/include/linux/dma-contiguous.h
> > > > index f247e8aa5e3d..00a370c1c140 100644
> > > > --- a/include/linux/dma-contiguous.h
> > > > +++ b/include/linux/dma-contiguous.h
> > > > @@ -115,6 +115,8 @@ struct page *dma_alloc_from_contiguous(struct device *dev, size_t count,
> > > > unsigned int order, bool no_warn);
> > > > bool dma_release_from_contiguous(struct device *dev, struct page *pages,
> > > > int count);
> > > > +struct page *dma_alloc_contiguous(struct device *dev, size_t size, gfp_t gfp);
> > > > +void dma_free_contiguous(struct device *dev, struct page *page, size_t size);
> > > >
> > > > #else
> > > >
> > > > @@ -157,6 +159,15 @@ bool dma_release_from_contiguous(struct device *dev, struct page *pages,
> > > > return false;
> > > > }
> > > >
> > > > +static inline
> > > > +struct page *dma_alloc_contiguous(struct device *dev, size_t size, gfp_t gfp)
> > > > +{
> > > > + return NULL;
> > > > +}
> > > > +
> > > > +static inline
> > > > +void dma_free_contiguous(struct device *dev, struct page *page, size_t size) { }
> > > > +
> > > > #endif
> > > >
> > > > #endif
> > > > diff --git a/kernel/dma/contiguous.c b/kernel/dma/contiguous.c
> > > > index b2a87905846d..21f39a6cb04f 100644
> > > > --- a/kernel/dma/contiguous.c
> > > > +++ b/kernel/dma/contiguous.c
> > > > @@ -214,6 +214,54 @@ bool dma_release_from_contiguous(struct device *dev, struct page *pages,
> > > > return cma_release(dev_get_cma_area(dev), pages, count);
> > > > }
> > > >
> > > > +/**
> > > > + * dma_alloc_contiguous() - allocate contiguous pages
> > > > + * @dev: Pointer to device for which the allocation is performed.
> > > > + * @size: Requested allocation size.
> > > > + * @gfp: Allocation flags.
> > > > + *
> > > > + * This function allocates contiguous memory buffer for specified device. It
> > > > + * first tries to use device specific contiguous memory area if available or
> > > > + * the default global one, then tries a fallback allocation of normal pages.
> > > > + */
> > > > +struct page *dma_alloc_contiguous(struct device *dev, size_t size, gfp_t gfp)
> > > > +{
> > > > + int node = dev ? dev_to_node(dev) : NUMA_NO_NODE;
> > > > + size_t count = PAGE_ALIGN(size) >> PAGE_SHIFT;
> > > > + size_t align = get_order(PAGE_ALIGN(size));
> > > > + struct cma *cma = dev_get_cma_area(dev);
> > > > + struct page *page = NULL;
> > > > +
> > > > + /* CMA can be used only in the context which permits sleeping */
> > > > + if (cma && gfpflags_allow_blocking(gfp)) {
> > > > + align = min_t(size_t, align, CONFIG_CMA_ALIGNMENT);
> > > > + page = cma_alloc(cma, count, align, gfp & __GFP_NOWARN);
> > > > + }
> > > > +
> > > > + /* Fallback allocation of normal pages */
> > > > + if (!page)
> > > > + page = alloc_pages_node(node, gfp, align);
> > > > +
> > > > + return page;
> > > > +}
> > > > +
> > > > +/**
> > > > + * dma_free_contiguous() - release allocated pages
> > > > + * @dev: Pointer to device for which the pages were allocated.
> > > > + * @page: Pointer to the allocated pages.
> > > > + * @size: Size of allocated pages.
> > > > + *
> > > > + * This function releases memory allocated by dma_alloc_contiguous(). As the
> > > > + * cma_release returns false when provided pages do not belong to contiguous
> > > > + * area and true otherwise, this function then does a fallback __free_pages()
> > > > + * upon a false-return.
> > > > + */
> > > > +void dma_free_contiguous(struct device *dev, struct page *page, size_t size)
> > > > +{
> > > > + if (!cma_release(dev_get_cma_area(dev), page, size >> PAGE_SHIFT))
> > > > + __free_pages(page, get_order(size));
> > > > +}
> > > > +
> > > > /*
> > > > * Support for reserved memory regions defined in device tree
> > > > */
> > > > diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
> > > > index 2c2772e9702a..0816c1e8b05a 100644
> > > > --- a/kernel/dma/direct.c
> > > > +++ b/kernel/dma/direct.c
> > > > @@ -96,8 +96,6 @@ static bool dma_coherent_ok(struct device *dev, phys_addr_t phys, size_t size)
> > > > struct page *__dma_direct_alloc_pages(struct device *dev, size_t size,
> > > > dma_addr_t *dma_handle, gfp_t gfp, unsigned long attrs)
> > > > {
> > > > - unsigned int count = PAGE_ALIGN(size) >> PAGE_SHIFT;
> > > > - int page_order = get_order(size);
> > > > struct page *page = NULL;
> > > > u64 phys_mask;
> > > >
> > > > @@ -109,20 +107,9 @@ struct page *__dma_direct_alloc_pages(struct device *dev, size_t size,
> > > > gfp |= __dma_direct_optimal_gfp_mask(dev, dev->coherent_dma_mask,
> > > > &phys_mask);
> > > > again:
> > > > - /* CMA can be used only in the context which permits sleeping */
> > > > - if (gfpflags_allow_blocking(gfp)) {
> > > > - page = dma_alloc_from_contiguous(dev, count, page_order,
> > > > - gfp & __GFP_NOWARN);
> > > > - if (page && !dma_coherent_ok(dev, page_to_phys(page), size)) {
> > > > - dma_release_from_contiguous(dev, page, count);
> > > > - page = NULL;
> > > > - }
> > > > - }
> > > > - if (!page)
> > > > - page = alloc_pages_node(dev_to_node(dev), gfp, page_order);
> > > > -
> > > > + page = dma_alloc_contiguous(dev, size, gfp);
> > > > if (page && !dma_coherent_ok(dev, page_to_phys(page), size)) {
> > > > - __free_pages(page, page_order);
> > > > + dma_free_contiguous(dev, page, size);
> > > > page = NULL;
> > > >
> > > > if (IS_ENABLED(CONFIG_ZONE_DMA32) &&
> > > > @@ -154,7 +141,7 @@ void *dma_direct_alloc_pages(struct device *dev, size_t size,
> > > > if (PageHighMem(page)) {
> > > > /*
> > > > * Depending on the cma= arguments and per-arch setup
> > > > - * dma_alloc_from_contiguous could return highmem pages.
> > > > + * dma_alloc_contiguous could return highmem pages.
> > > > * Without remapping there is no way to return them here,
> > > > * so log an error and fail.
> > > > */
> > > > @@ -176,10 +163,7 @@ void *dma_direct_alloc_pages(struct device *dev, size_t size,
> > > >
> > > > void __dma_direct_free_pages(struct device *dev, size_t size, struct page *page)
> > > > {
> > > > - unsigned int count = PAGE_ALIGN(size) >> PAGE_SHIFT;
> > > > -
> > > > - if (!dma_release_from_contiguous(dev, page, count))
> > > > - __free_pages(page, get_order(size));
> > > > + dma_free_contiguous(dev, page, size);
> > > > }
> > > >
> > > > void dma_direct_free_pages(struct device *dev, size_t size, void *cpu_addr,
> > > > --
> > > > 2.17.1
> > > >
^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2019-07-25 23:42 UTC | newest]
Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-05-24 4:06 [PATCH v3 0/2] Optimize dma_*_from_contiguous calls Nicolin Chen
2019-05-24 4:06 ` [PATCH v3 1/2] dma-contiguous: Abstract dma_{alloc,free}_contiguous() Nicolin Chen
2019-05-29 18:35 ` Nathan Chancellor
2019-05-29 22:48 ` Nicolin Chen
2019-07-25 16:06 ` Ezequiel Garcia
2019-07-25 16:50 ` Nicolin Chen
2019-07-25 17:31 ` Dafna Hirschfeld
2019-07-25 23:42 ` Nicolin Chen
2019-05-24 4:06 ` [PATCH v3 2/2] dma-contiguous: Use fallback alloc_pages for single pages Nicolin Chen
2019-05-24 16:16 ` Ira Weiny
2019-05-27 10:55 ` Nicolin Chen
2019-05-24 20:03 ` [PATCH v3 0/2] Optimize dma_*_from_contiguous calls dann frazier
2019-05-28 6:04 ` Christoph Hellwig
2019-05-29 23:06 ` Nicolin Chen
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).