alpha-mind
Commit bc595833, authored Apr 25, 2017 by Dr.李
modified to get better performance
parent 8ad5bdb1
Showing 2 changed files with 32 additions and 9 deletions
alphamind/data/standardize.py +9 -2
alphamind/data/winsorize.py +23 -7
alphamind/data/standardize.py @ bc595833
@@ -15,8 +15,15 @@ def standardize(x: np.ndarray, groups: np.ndarray=None) -> np.ndarray:
         df = pd.DataFrame(x)
         gs = df.groupby(groups)
-        mean_values = gs.transform(np.mean).values
-        std_values = gs.transform(np.std).values
+        mean_values = gs.mean()
+        std_values = gs.std().values
+        value_index = np.searchsorted(mean_values.index, groups)
+        mean_values = mean_values.values
+        mean_values = mean_values[value_index]
+        std_values = std_values[value_index]
         return (x - mean_values) / std_values
     else:
         return (x - x.mean(axis=0)) / x.std(axis=0)
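The change above replaces `gs.transform(...)` with one aggregate row per group, then expands those rows back to the sample rows through `np.searchsorted`. The sketch below isolates that idea under a few assumptions: the function name `standardize_by_group` is hypothetical, `.index.values` is passed to `np.searchsorted` instead of the index object itself, and the group labels are taken to be sortable with every label present in the aggregate index (true for a default `groupby`, which sorts its keys). Note that `gs.std()` uses pandas' default `ddof=1`, while the ungrouped branch's `x.std(axis=0)` uses NumPy's default `ddof=0`, so the two branches may not be scaled identically.

```python
import numpy as np
import pandas as pd


def standardize_by_group(x: np.ndarray, groups: np.ndarray) -> np.ndarray:
    """Z-score each column of x within its group (hypothetical helper name)."""
    gs = pd.DataFrame(x).groupby(groups)

    # One row per group; the index holds the sorted unique group labels.
    mean_values = gs.mean()
    std_values = gs.std().values  # pandas default: sample std, ddof=1

    # Each label in `groups` exists in the sorted index, so searchsorted
    # returns the exact row position of that label.
    value_index = np.searchsorted(mean_values.index.values, groups)

    # Fancy indexing expands the per-group rows to one row per sample.
    return (x - mean_values.values[value_index]) / std_values[value_index]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal((3000, 10))
    groups = rng.integers(20, 40, size=3000)
    z = standardize_by_group(x, groups)
    print(z.shape)
```

The aggregate-then-index form computes each statistic once per group rather than broadcasting it inside pandas, which is presumably where the commit's speed-up comes from; timing both variants on representative data is the only reliable check.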
alphamind/data/winsorize.py @ bc595833
@@ -8,22 +8,38 @@ Created on 2017-4-25
 import pandas as pd
 import numpy as np
 def winsorize_normal(x: np.ndarray, num_stds: int=3, groups: np.ndarray=None) -> np.ndarray:
     if groups is not None:
         df = pd.DataFrame(x)
         gs = df.groupby(groups)
-        mean_values = gs.transform(np.mean).values
-        std_values = gs.transform(np.std).values
+        mean_values = gs.mean()
+        std_values = gs.std().values
+        value_index = np.searchsorted(mean_values.index, groups)
+        mean_values = mean_values.values
+        ubound = mean_values + num_stds * std_values
+        lbound = mean_values - num_stds * std_values
+        ubound = ubound[value_index]
+        lbound = lbound[value_index]
     else:
         std_values = x.std(axis=0)
         mean_values = x.mean(axis=0)
-    ubound = mean_values + num_stds * std_values
-    lbound = mean_values - num_stds * std_values
-    res = np.where(x > ubound, ubound, x)
-    res = np.where(res < lbound, lbound, res)
+        ubound = mean_values + num_stds * std_values
+        lbound = mean_values - num_stds * std_values
+    res = np.where(x > ubound, ubound, np.where(x < lbound, lbound, x))
     return res
+if __name__ == '__main__':
+    x = np.random.randn(3000, 10)
+    groups = np.random.randint(20, 40, size=3000)
+    for _ in range(1000):
+        winsorize_normal(x, 2, groups)
\ No newline at end of file
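The grouped branch of `winsorize_normal` builds the clipping bounds once per group and only then expands them to one row per sample. A minimal sketch of that flow follows; the helper names `group_bounds` and `winsorize_by_group` are hypothetical, `.index.values` is used for the `np.searchsorted` lookup, and the inputs are assumed to contain no NaNs and no single-member groups (for which the `ddof=1` standard deviation would be NaN).

```python
import numpy as np
import pandas as pd


def group_bounds(x: np.ndarray, groups: np.ndarray, num_stds: int = 3):
    """Return per-sample (lbound, ubound) built from per-group mean and std."""
    gs = pd.DataFrame(x).groupby(groups)
    mean_values = gs.mean()       # index holds the sorted unique group labels
    std_values = gs.std().values  # pandas default: sample std, ddof=1

    # Map every sample to the row of its group in the aggregate tables.
    value_index = np.searchsorted(mean_values.index.values, groups)

    # Bounds are computed on the small per-group tables, then expanded.
    ubound = mean_values.values + num_stds * std_values
    lbound = mean_values.values - num_stds * std_values
    return lbound[value_index], ubound[value_index]


def winsorize_by_group(x: np.ndarray, groups: np.ndarray, num_stds: int = 3) -> np.ndarray:
    """Clamp each value to mean +/- num_stds * std of its own group."""
    lbound, ubound = group_bounds(x, groups, num_stds)
    return np.where(x > ubound, ubound, np.where(x < lbound, lbound, x))


if __name__ == "__main__":
    rng = np.random.default_rng(42)
    x = rng.standard_normal((3000, 10))
    groups = rng.integers(20, 40, size=3000)
    res = winsorize_by_group(x, groups, num_stds=2)
    print(res.shape)
```

As long as the lower bound does not exceed the upper bound, the nested `np.where`, two sequential `np.where` calls, and `np.clip(x, lbound, ubound)` all give the same result, so the choice between them is a matter of readability and measured speed rather than behavior.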