-
Notifications
You must be signed in to change notification settings - Fork 28.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
module: resolve format for all situations with auto module detection on #53044
base: main
Are you sure you want to change the base?
module: resolve format for all situations with auto module detection on #53044
Conversation
Review requested:
|
if the solution is feasible and finds approval, the |
Also, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
Tip
While my review shows my support, I am not a core collaborator, and this review has no power / place in the approval process
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work! Thanks for getting to the bottom of this.
src/node_options.h
Outdated
@@ -110,7 +110,7 @@ class EnvironmentOptions : public Options { | |||
public: | |||
bool abort_on_uncaught_exception = false; | |||
std::vector<std::string> conditions; | |||
bool detect_module = false; | |||
bool detect_module = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we want to fix the format issue in one PR and then unflag the feature in the follow-up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will split this into two PRs once the failing test with __filename
usage is sorted out
@@ -92,6 +92,12 @@ let typelessPackageJsonFilesWarnedAbout; | |||
function getFileProtocolModuleFormat(url, context = { __proto__: null }, ignoreErrors) { | |||
const { source } = context; | |||
const ext = extname(url); | |||
const deduceFormat = (fromSource, fromUrl) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should declare this at the top level. I'd also maybe name it determineFormat
or something more specific to explain the cases when it's used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could also name it what the flag is named, something like detectModuleFormat
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
Will check your comments and split it into two PRs, will happen maybe beginning of next week. |
Lovely! |
1d7b4de
to
b7ec307
Compare
split done, this fixes the |
if (Buffer.isBuffer(realSource)) { | ||
// `containsModuleSyntax` requires source to be passed in as string | ||
realSource = realSource.toString(); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I remove the following lines, all tests still pass. I suggest we remove them (unless we can have a test, but maybe it should be its own PR).
if (Buffer.isBuffer(realSource)) { | |
// `containsModuleSyntax` requires source to be passed in as string | |
realSource = realSource.toString(); | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pushed a commit to fix that. This will probably be necessary but I agree, an own PR with relevant test is better
@@ -0,0 +1,13 @@ | |||
// Flags: --experimental-detect-module --import ./test/fixtures/es-module-loaders/builtin-named-exports.mjs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rather than make a new file for this, can it please get moved into the file that has the other tests with this fixture?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good point, made that with last commit
*/ | ||
function detectModuleFormat(source, url) { | ||
try { | ||
let realSource = source ?? readFileSync(url, 'utf8'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An earlier version of containsModuleSyntax
expected a file path, where it would load a file from disk and then parse it all in C++ land. Doing readFileSync
here means that we cross from JS to C++ to get the file contents and send that large string back across the boundary, only to then send it right back again from JS to C++ for containsModuleSyntax
to work with it.
A more performant approach is probably to update containsModuleSyntax
to allow for an undefined input in the source
parameter, and another optional parameter that can be a file path. Then in this situation where the JS side doesn’t already have the source, we can tell the C++ side to both read it from disk and analyze it, without multiple trips crossing the boundary. cc @anonrig
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I reintroduced the reading from file in the native code implementation for the case where code
is not passed from the managed part. Slightly modified the logic as proposed in this comment.
The transpile case is currently the only one for which resolve
cannot determine the format correctly. For that we only have the needed information during the load phase, not during the resolve hook is running.
@@ -155,7 +155,7 @@ describe('--experimental-detect-module', { concurrency: !process.env.TEST_PARALL | |||
}); | |||
} | |||
|
|||
it('should not hint wrong format in resolve hook', async () => { | |||
it('should hint format correctly for extensionles modules resolve hook', async () => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it('should hint format correctly for extensionles modules resolve hook', async () => { | |
it('should hint format correctly for the resolve hook for extensionless modules', async () => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
* @param {string} source | ||
* @param {URL} url | ||
*/ | ||
function detectModuleFormat(source, url) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The format
property returned from resolve
is optional. The addition of this helper seems like it’s becoming required; if we don’t know the format, we’re going to determine it, rather than waiting for load
to figure it out. This means potentially reading the source from disk twice, unless you preserve what you get from the first read, which detectModuleFormat
here currently doesn’t. Is there another way to fix the bug where we don’t necessarily read the source in resolve
if we weren’t already doing so?
Reading the source within resolve
is also problematic because that should be happening in the load
hook. If you read file sources here, any custom hooks the user has registered for load
won’t get applied before detection is run on whatever source is read here. So a TypeScript file would get misidentified as CommonJS because it can’t parse as ESM (or really, because it can’t parse at all) even if after the hooks are applied it’s transpiled into runnable ESM.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While I was thinking for a solution I had a similar chain of thoughts. This change makes resolve
format resolution more reliable for most of the cases but as you also mentioned, it still does not cover the case where the load
operation would have source modifications (e.g. some ts transpile step). Maybe a better solution would be to:
- try compile as CJS
- if possible => detected CJS
- if not possible, try to compile as ESM
- if possible => detected ESM
- otherwise
resolve
detects formatnull
(orundefined
)
This would at least cover all the current tests, it will have the additional performance penalty of reading the file but it will always resolve the format
correctly or not resolve it at all if it depends on load
operation chain.
I will set the PR to draft until we decide on next steps from here.
Is there another way to fix the bug where we don’t necessarily read the source in resolve if we weren’t already doing so?
not as far as I can think now. Because we have to detect the module type that was not defined by the module author and for that, we currently try to compile it. If we have no source, we cannot make any assumptions that are valid. And it is better imo to stay with null
or undefined
in this case instead of trying to announce commonjs
knowing that this might be wrong and changed later by the load
step. This cannot be reliably solved without looking at the code of the module. Or do you see some alternative to it?
Concerning the performance, the whole auto detection mechanism is anyway a performance penalty because we parse/compile the source for it now as well. And we generate the warning that the user should fix it by specifying the type here
afb82c8
to
b2f221f
Compare
This comment was marked as off-topic.
This comment was marked as off-topic.
The C++ linter is failing |
This comment was marked as off-topic.
This comment was marked as off-topic.
This comment was marked as off-topic.
This comment was marked as off-topic.
Co-authored-by: Antoine du Hamel <duhamelantoine1995@gmail.com>
triggered by #53015
solves: #53016
this should be a consistent fix to always resolve the module format correctly.
Enabling module detection by default made a few other tests need some adjustments because in this case they don't generate errors anymore. e.g.
test-esm-cjs-exports.js
instead of error becasue a .mjs imports a .js with ESM syntax it now successfully imports it and generates the warning that this should be fixed to avoid the performance penalty.Kindly please review and let me know what you think (if changes are necessary).
make test && make lint
=> green@nodejs/loaders